AT&T Labs - Research

The DCD Library is a software collection for speech recognition decoding and related functions. Based on the Finite-State Machine (FSM) Library, it provides higher-level operations needed specifically for decoding. These include algorithms that:

  1. make an optimized recognition transducer ("network") from a grammar, a lexicon, and a context-dependency specification
  2. search a recognition transducer to find the best matching path or a set of paths ("lattice"), given an input utterance and an acoustic model
  3. weight lattices with just their grammatical and acoustic components
  4. align reference and hypothesis automata using edit distances ("scoring")
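
To make the "scoring" operation in item 4 concrete, the following is a minimal sketch of edit-distance alignment cost. It is an illustration only, under simplifying assumptions: the library aligns automata, whereas this sketch compares two plain word sequences, and the function name WordEditDistance is hypothetical and not part of the DCD Library API. Every substitution, insertion, and deletion is assumed to cost 1.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper for illustration; NOT a DCD Library routine.
// Returns the minimum number of word substitutions, insertions, and
// deletions (unit cost each) needed to turn `ref` into `hyp`.
std::size_t WordEditDistance(const std::vector<std::string>& ref,
                             const std::vector<std::string>& hyp) {
  // prev[j]: distance between the reference prefix processed so far
  // and the first j hypothesis words. Initially the reference prefix
  // is empty, so the cost is j insertions.
  std::vector<std::size_t> prev(hyp.size() + 1), cur(hyp.size() + 1);
  for (std::size_t j = 0; j <= hyp.size(); ++j) prev[j] = j;

  for (std::size_t i = 1; i <= ref.size(); ++i) {
    cur[0] = i;  // i deletions align i reference words with nothing
    for (std::size_t j = 1; j <= hyp.size(); ++j) {
      const std::size_t sub =
          prev[j - 1] + (ref[i - 1] == hyp[j - 1] ? 0 : 1);
      const std::size_t del = prev[j] + 1;     // drop a reference word
      const std::size_t ins = cur[j - 1] + 1;  // extra hypothesis word
      cur[j] = std::min({sub, del, ins});
    }
    std::swap(prev, cur);
  }
  return prev[hyp.size()];
}
```

Dividing the returned count by the number of reference words gives the usual word error rate. Working on automata rather than single sequences, as the library does, allows whole lattices of hypotheses to be scored at once.
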

This library is not intended to be a complete speech recognition system. It includes no acoustic model, no acoustic feature generator ("front-end"), and no acoustic, lexical, or grammatical training; all of these are outside its scope. It does, however, provide a general, extensible acoustic model interface; it accepts general context-dependency, lexical, and grammatical models encoded as finite-state transducers; and it reads pre-computed acoustic features in a variety of formats.

Our emphasis is on the generality, flexibility, and performance of the operations provided here, which are meant to serve as key algorithmic components of an automatic speech recognition (ASR) system. Several of the general operations should also find application outside of speech recognition.

The library is offered at two levels: a set of program-level executables intended for "batch" speech recognition training and testing, and a set of C++ library-level routines intended for easy integration into an ASR system. The components are modular and independent, so users can select only the functionality they require.