SYNOPSIS
       dmake [ -a fsm | -b fsm | -n fsm | -l l | -f hmms -c c | -mv? ] ...

       dweight [ -opts ] [ fsm_archive ]

       dalign [ -opts ] ref_fsm_archive hyp_fsm_archive

DESCRIPTION
       These commands are for use with the DCD  finite-state  transducer-based
       recognition system.

       dmake  generates  an  optimized  recognition  transducer from component
       transducer models. Here is an example:

       dmake -mv -a hmm.fsm -b cntx.fsm -a lex.fsm -b gram.fsm >hclg.fsm

       In this example, the resulting  transducer  hclg.fsm  is  an  optimized
       recognition   transducer  mapping  sequences  of  HMM  states  to  word
       sequences constructed from the grammar automaton gram.fsm, the  lexicon
       transducer lex.fsm, the context-dependency transducer cntx.fsm, and the
       HMM state-level transducer hmm.fsm.
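
       The flag description states that the components are composed and
       determinized right-to-left. The following is a minimal sketch of that
       ordering only, applied to the file names from the example. The function
       name composition_plan and the "o" notation are illustrative assumptions;
       no FSM operation is actually performed.

```python
def composition_plan(components):
    """Return the composition steps, starting from the rightmost component.

    `components` lists the component FSM names in command-line order.
    Only the nesting order is shown; nothing is composed or determinized.
    """
    if not components:
        return []
    result = components[-1]
    steps = []
    # Walk the remaining components from right to left.
    for left in reversed(components[:-1]):
        result = f"({left} o {result})"
        steps.append(result)
    return steps


plan = composition_plan(["hmm.fsm", "cntx.fsm", "lex.fsm", "gram.fsm"])
for step in plan:
    print(step)
```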

       The flags that take FSM arguments specify  the  component  transducers,
       which  will  first  be pre-processed (as described below) and then com-
       posed and determinized right-to-left.  The particular flag used  speci-
       fies the type of component transducer and how it will be pre-processed.
       The -a flag declares the FSM argument is  an  acyclic  transducer  that
       will be disambiguated and closed. A typical such input would be a lexi-
       con transducer that maps phone strings to word labels and their associ-
       ated  costs  or  an  HMM  transducer that maps HMM state strings to HMM
       (context-dependent phone) labels. The canonical form of  this  kind  of
       transducer  has  the  output  labels  and costs as close to the initial
       state as possible. If the input is not in canonical form,  it  will  be
       preprocessed to be so.  The -b flag declares the FSM argument to be bi-
       determinizable, i.e., both the FSM and its inverse  are  determinizable
       (where  epsilons  are treated as regular symbols). A typical such input
       would be an n-gram grammar (e.g., as produced by the AT&T GRM  library,
       grmintro(1))  or  a  context-dependency transducer (e.g., as produced by
       the AT&T AMTOOLS library, amintro(1)).  The canonical form of this kind
       of transducer is deterministic with respect to the output labels. If the
       input is not in canonical form, it will be preprocessed to be so.  The
       -n  flag
       can be used to pass an arbitrary component FSM. Note dmake may not ter-
       minate on some inputs of this type. A sufficient condition for termina-
       tion  (given  enough resources) is that the input be determinizable.  A
       typical use of this command, as in the example above, is to produce  an
       HMM state-level optimized recognition transducer. If the last component
       transducer is stochastic (probabilities sum to 1 at each state) and  if
       the   remaining  component  transducers  are  conditionally  stochastic
       restricted to a given output string, the probabilities sum to 1 at each
       often superior in real-time performance to:

       dmake -mv -b cntx.fsm -a lex.fsm -b gram.fsm >clg.fsm

       Neither is typically as good as fully  optimizing  to  the  HMM  state
       level.
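
       The stochasticity condition mentioned above (probabilities sum to 1 at
       each state) can be illustrated with a small check. This is a sketch
       under stated assumptions: the transducer is given as a plain list of
       (source, destination, probability) arcs, weights are already ordinary
       probabilities rather than costs, and any final probability of a state
       counts toward its total. The name is_stochastic is hypothetical and not
       part of DCD.

```python
import math
from collections import defaultdict


def is_stochastic(arcs, final_probs=None, tol=1e-9):
    """Return True if the probabilities leaving every state sum to 1.

    arcs        -- iterable of (src, dst, prob) tuples
    final_probs -- optional dict mapping state -> final probability
                   (assumption: it counts toward that state's total)
    """
    totals = defaultdict(float)
    for src, _dst, prob in arcs:
        totals[src] += prob
    for state, prob in (final_probs or {}).items():
        totals[state] += prob
    return all(math.isclose(total, 1.0, abs_tol=tol)
               for total in totals.values())


good = [(0, 1, 0.5), (0, 2, 0.5), (1, 2, 1.0)]
bad = [(0, 1, 0.5), (0, 2, 0.25)]
print(is_stochastic(good, final_probs={2: 1.0}))
print(is_stochastic(bad))
```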

       The  -m  flag  to  dmake minimizes the (encoded) intermediate and final
       results. The -f flag factors the recognition transducer by finding lin-
       ear  chains of transitions, identifying each unique chain as an HMM and
       outputting it to the HMMs specification filename argument, and  replac-
       ing  those chains in the recognition transducer with single transitions
       whose input labels are the corresponding HMM IDs. The -l flag sets  the
       maximum  chain  length  (default:  unlimited), and the -c flag sets the
       maximum number of chains (default:  unlimited).   The  -v  flag  prints
       progress information to standard error.
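
       The factoring step described for the -f flag can be sketched as follows.
       This is not dmake's implementation: it assumes an acceptor given as
       (source, label, destination) arcs, ignores output labels and weights,
       treats a state as chain-interior when it is neither initial nor final
       and has exactly one incoming and one outgoing arc, and does not model
       the -l and -c limits. The name factor_chains and the integer chain IDs
       are hypothetical.

```python
from collections import Counter


def factor_chains(arcs, initial, finals):
    """Replace linear chains of arcs by single arcs labeled with a chain ID.

    arcs -- list of (src, label, dst) tuples
    Returns (new_arcs, chain_ids), where chain_ids maps each unique label
    sequence to the integer ID used on the replacement arcs.
    """
    indeg = Counter(dst for _src, _label, dst in arcs)
    outdeg = Counter(src for src, _label, _dst in arcs)
    # Only consulted for interior states, which have exactly one outgoing arc.
    out_arc = {src: (label, dst) for src, label, dst in arcs}

    def interior(state):
        return (state != initial and state not in finals
                and indeg[state] == 1 and outdeg[state] == 1)

    chain_ids = {}
    new_arcs = []
    for src, label, dst in arcs:
        if interior(src):
            continue  # already covered by the walk from the chain's first arc
        labels = [label]
        while interior(dst):
            next_label, dst = out_arc[dst]
            labels.append(next_label)
        chain = tuple(labels)
        chain_id = chain_ids.setdefault(chain, len(chain_ids))
        new_arcs.append((src, chain_id, dst))
    return new_arcs, chain_ids


arcs = [(0, "s1", 1), (1, "s2", 2), (2, "s3", 3),
        (3, "s1", 4), (4, "s2", 5), (5, "s3", 6),
        (3, "s4", 6)]
new_arcs, chain_ids = factor_chains(arcs, initial=0, finals={6})
print(new_arcs)
print(chain_ids)
```

       In this toy input the chain s1 s2 s3 occurs twice and is stored once,
       which is the point of identifying each unique chain.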

       dweight  modifies  the weights on each FSM in the input fsm_archive. If
       the flag -p is used, the weights are pushed to form a probability  dis-
       tribution  over  transitions.   If  the flag -r weight_fsm is used, the
       weights in weight_fsm are used to replace the weights on matching paths
       in  the  input FSM. A common use of this flag is to replace the weights
       on word lattices entirely with the weights from a word grammar.  If the
       flag -s weight_fsm is used, the weights in weight_fsm are subtracted
       from those of the input FSM. A common use of this flag is to subtract
       the grammar
       used  in recognition from lattices containing both acoustic and grammar
       weights so that the resulting lattices  will  have  just  the  acoustic
       scores  on them. One of the above flags is mandatory.  These operations
       use determinization, which may use considerable time and space for some
       inputs.  Pruning  can  help  in this matter: the -c flag can be used to
       prune the determinization to a particular cost threshold; the -m m  and
       -b  b  flags  limit  the  determinization  to  nstates*m+b states where
       nstates is the number of states in the input FSM (see fsmprune in
       fsm(1) for more information).
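
       The effect of the -r and -s operations, and the state limit set by -m
       and -b, can be sketched on a toy model. The assumptions are loud ones:
       each FSM is reduced to a dict mapping a path (a tuple of words) to its
       total cost, costs combine by addition, and paths with no match in
       weight_fsm are left unchanged. None of this reflects dweight's actual
       data structures, and all function names are hypothetical.

```python
def replace_weights(lattice, weight_fsm):
    """Sketch of -r: take the cost from weight_fsm on matching paths."""
    # Assumption: paths without a match keep their original cost.
    return {path: weight_fsm.get(path, cost) for path, cost in lattice.items()}


def subtract_weights(lattice, weight_fsm):
    """Sketch of -s: subtract weight_fsm's cost on matching paths."""
    # Assumption: paths without a match are left unchanged.
    return {path: cost - weight_fsm.get(path, 0.0)
            for path, cost in lattice.items()}


def state_limit(nstates, m, b):
    """State bound described for the -m m and -b b flags: nstates*m+b."""
    return nstates * m + b


# Lattice costs hold acoustic + grammar scores; grammar holds grammar scores.
lattice = {("a", "b"): 12.5, ("a", "c"): 14.0}
grammar = {("a", "b"): 2.5, ("a", "c"): 3.0}

acoustic_only = subtract_weights(lattice, grammar)
grammar_only = replace_weights(lattice, grammar)
print(acoustic_only)
print(grammar_only)
print(state_limit(1000, 2, 500))
```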

       dalign returns an alignment FSM for each input reference and hypothesis
       FSM.  By default, all alignments are returned. With the -n  flag,  only
       the n-best alignments are returned. With the -t flag, cumulative scoring
       statistics are additionally returned (to standard error). With  the
       -T  flag, per alignment and cumulative scoring statistics are addition-
       ally returned (to standard error). With the -p flag, percent statistics
       rather  than  absolute  counts  are printed. The -s flag determines the
       algorithm used. With -s tropical, an efficient algorithm  is  used  for
       the case of unweighted input FSMs (if weights are present in this case,
       they are ignored). With -s log, the more general case of weighted input
       automata is permitted (not yet implemented).  The -S, -D, and -I flags
       can be used to set the substitution, deletion, and insertion costs,
       respectively.
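
       To make the role of the substitution, deletion, and insertion costs
       concrete, here is a minimal sketch of a single best alignment between
       two word strings by dynamic programming. It is a simplification:
       dalign operates on FSMs and can return all or the n-best alignments,
       whereas this sketch aligns two plain sequences and returns one
       alignment. The function name align and the default cost of 1 for each
       operation are assumptions.

```python
def align(ref, hyp, sub_cost=1, del_cost=1, ins_cost=1):
    """Return (total_cost, ops) for one cheapest alignment of ref and hyp.

    ops is a list of (kind, ref_word, hyp_word) with kind in
    {"match", "sub", "del", "ins"}.
    """
    n, m = len(ref), len(hyp)
    # cost[i][j] is the cheapest alignment of ref[:i] with hyp[:j].
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + del_cost
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + ins_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = 0 if ref[i - 1] == hyp[j - 1] else sub_cost
            cost[i][j] = min(cost[i - 1][j - 1] + step,
                             cost[i - 1][j] + del_cost,
                             cost[i][j - 1] + ins_cost)

    # Trace back from the end to recover one cheapest alignment.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            same = ref[i - 1] == hyp[j - 1]
            step = 0 if same else sub_cost
            if cost[i][j] == cost[i - 1][j - 1] + step:
                ops.append(("match" if same else "sub", ref[i - 1], hyp[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + del_cost:
            ops.append(("del", ref[i - 1], None))
            i -= 1
        else:
            ops.append(("ins", None, hyp[j - 1]))
            j -= 1
    ops.reverse()
    return cost[n][m], ops


total, ops = align("the cat sat".split(), "the hat sat on".split())
print(total)
for op in ops:
    print(op)
```

       Counting the "sub", "del", and "ins" entries in ops gives absolute
       counts of the kind that the scoring statistics summarize.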

CAVEATS
       On non-POSIX compliant systems, binary data directed to standard output
       may become corrupted. A command-line argument of ''-o'' can instead  be
SEE ALSO
       fsm(1)                             FSM user commands.
       fsmaccess(3)                       FSM C accessors.
       far(1)                             FSM archive user commands.
FILES
       /n/lvr/linux/bin/dcd-2             Distribution binaries.
       /n/lvr/linux/src/cmd/dcd/dcd-2     Distribution sources.
       /n/lvr/linux/include/dcd-2         Distribution DCD include files.
       /n/lvr/linux/lib/libdcd-2.a        Distribution DCD library.
AUTHORS
       Cyril Allauzen (allauzen@research.att.com)
       Mehryar Mohri (mohri@research.att.com)
       Michael Riley (riley@research.att.com)

       Copyright (C) 2003 AT&T Corp. All rights reserved.



Version 2.0                                                          DUTILS(1)