fsmconnect,  fsmdeterminize,  fsmpush, fsmminimize, fsmencode, fsmquan-
       tize, fsmequiv, fsmrmepsilon, fsmsynchronize, fsmepsnormalize, fsmbest-
       path,  fsmprune,  fsmconvert, fsmarcsort, fsmtopsort, fsminfo - finite-
       state machine utilities


SYNOPSIS
       fsmcompile [ -i symbols ] [ -o symbols ] [ -S symbols ] [  -t  ]  [  -f
       fsmclass ] [ -s semiring ] [ file ]

       fsmprint [ -i symbols ] [ -o symbols ] [ -s symbols ] [ -l ] [ fsm ]

       fsmdraw  [ -i symbols ] [ -o symbols ] [ -s symbols ] [ -w x ] [ -h y ]
       [ -f font ] [ -F n ] [ -lpv ] [ fsm ]

       fsmunion [ fsm1 ... ]

       fsmconcat [ fsm1 ... ]

       fsmclosure [ -p ] [ fsm ]

       fsmreverse [ fsm ]

       fsminvert [ fsm ]

       fsmproject [ -io12 ] [ fsm ]

       fsmintersect [ -c cthresh ] [ -n nthresh ] [ -1d ] fsm1  fsm2  [   fsm3
       ...]

       fsmcompose  [  -c  cthresh  ]  [ -n nthresh ] [ -1d ] fsm1 fsm2 [  fsm3
       ...]

       fsmdifference [ -c cthresh ] [ -n nthresh ] [ -1d ] fsm1 fsm2

       fsmconnect [ -t ] [ fsm ]

       fsmdeterminize [ -c cthresh ] [ -n nthresh ] [ fsm ]

       fsmpush [ -clifrIkd ] [ fsm ]

       fsmminimize [ fsm ]

       fsmencode [ -cld ] fsm key

       fsmquantize [ fsm ]

       fsmequiv [ -v ] fsm1 fsm2

       fsmrmepsilon [ -c cthresh ] [ -n nthresh ] [ -1d ] [ fsm ]

       fsmsynchronize [ -c cthresh ] [ -n nthresh ] [ -d ] [ fsm ]

DESCRIPTION
       These   commands  construct,  display,  combine,  minimize  and  search
       weighted finite-state machines (FSMs).  Two types of machines are  sup-
       ported,  acceptors and transducers. See fsmaccess(3) for definitions of
       the components of an FSM: state, start state, arc  (transition),  input
       symbol, output symbol, arc cost, final cost and final state.

       Except  as  described  for the commands that compile, print and display
       FSMs, all the commands take as input one or  more  binary-encoded  FSMs
       and  send  to standard output one binary-encoded FSM. For commands that
       take one input FSM, the FSM can be given as standard input or on a file
       specified  as a command-line argument. For commands that take more than
       one FSM, all inputs are specified by command-line arguments. A command-
       line  argument  of  ''-''  specifies that the corresponding FSM will be
       provided on standard input.  A command-line argument of -?  will  print
       usage information.

   FSM COMPILATION AND DISPLAY
       fsmcompile  takes  input representing an FSM from file file or standard
       input, and sends to standard output its  binary  encoding.   The  input
       should  be  the  textual  representation  of  an FSM (see fsm(5)).  FSM
       states, input symbols and output symbols are represented in  the  input
       by  non-negative numbers, unless the options -S symbols, -i symbols, -o
       symbols are used. These options allow state, input symbols  and  output
       symbols,  respectively,  to be given textual names, where symbols files
       give the translation from those names and numbers (see   fsm(5)).   The
       input  should  be  an acceptor, unless the -t option is given, in which
       case it should be a transducer.  The -f option specifies the FSM repre-
       sentation  class  (see  fsmclass(3));  by default, it is FSMBasicClass.
       The -s option specifies the cost semiring (see fsmcost(3)); by default,
       it is SMRTropical.  Note fsmcompile will internally renumber the states
       from 0 to the number of states minus 1.

       fsmprint prints the input FSM on standard  output  using  same  textual
       format as fsmcompile accepts as input (see  fsm(5)). States, input sym-
       bols and output symbols are printed in numeric form, unless the options
       -s  symbols,  -i  symbols, -o symbols are used to provide textual names
       for  states,  input  symbols  and  output  symbols,  respectively  (see
       fsm(5)).   Note  that  the  states may be renumbered by  fsmcompile  or
       other operations.  With the -l option, additional information is output
       that  when  read  by fsmcompile preserves the state potentials, the FSM
       representation class, the cost semiring, and whether it is an  acceptor
       or transducer.

       fsmdraw  sends  to standard output a dot(1) graph representation of the
       input FSM (The command dot -Tps can be used to convert from dot  format
       to PostScript.)  States, input symbols and output symbols are displayed
       in numeric form, unless the options -s symbols, -i symbols, -o  symbols
       are  used to provide textual names for states, input symbols and output
       symbols, respectively (see  fsm(5)). The options -w x and -h x set  the
       page  width  and  height  (in  inches),  -f fontname sets the font name

       fsmreverse reverses the input FSM,  that  is,  the  input  FSM  accepts
       string  a1 ... aN with a certain cost iff the output FSM accepts aN ...
       a1 with the same cost (and similarly for transducers).

       fsminvert inverts a transducer,  that  is,  the  input  FSM  transduces
       string  s1 to s2 iff the output machine transduces string s2 to s1 with
       the same cost.  It does so by tranposing the input and  output  symbols
       on each transition.

       fsmproject converts a transducer into an acceptor by retaining only the
       input (with -i or -1) or output (with -o or -2) label on  each  transi-
       tion.

       fsmintersect  returns  the intersection of two or more acceptors.  Each
       input FSM accepts string s iff the output FSM accepts s with the  costs
       combined by the SMRTimes operation (see fsmcost(3)).

       fsmcompose returns the relational composition of the input FSMs, in the
       order given in the command line. With two input FSMs, for  example,  if
       the  first  machine  transduces  string s1 to s2 and the second machine
       transduces s2 to s3, then the output machine will transduce  s1  to  s3
       with the two costs combined by the SMRTimes operation (see fsmcost(3)).
       If an input machine is an acceptor, it is treated as a transducer  from
       the language it accepts to itself.

       fsmdifference  returns  the  intersection of the acceptor fsm1 with the
       complement of the costless, deterministic, epsilon-free acceptor  fsm2.

   FSM MINIMIZATION AND EQUIVALENCE
       Two  acceptors  are equivalent if they accept the same strings with the
       same costs; two transducers are equivalent if they transduce  the  same
       input strings to the same output strings with the same costs.

       fsmconnect  returns an equivalent FSM from which any states and arcs in
       the input that do not lie on a path from the start  state  to  a  final
       state  have  been removed. With the -t option, it returns exit status 1
       if the output has no states, which is useful for testing the input  for
       emptiness.

       fsmdeterminize  returns  a  deterministic FSM that is equivalent to the
       input, which must be determinizable.  Epsilon arcs are treated the same
       as other symbols.

       fsmpush  returns  a  pushed  FSM  equivalent  to the input. With the -c
       option, the topology of the input FSM is unchanged and the  SMRPlus-sum
       of  the costs of the outgoing arcs (with -i) or incoming arcs (with -f)
       at each state equals SMROne. By default, the residual  cost  (the  SMR-
       Plus-sum  of costs of all complete paths) is placed final -- the origi-
       nal final costs are SMRTimes-multiplied by this residual cost. With -I,
       the  residual  cost  is  instead placed initial -- the cost of each arc
       leaving the initial state is SMRTimes-multiplied by this cost. With the
       over the alphabet of pairs of input labels and output labels by  encod-
       ing  each distinct input label and output label pair of an arc as a new
       label in the output FSM.  fsmencode with the  -c  option  represents  a
       weighted FSM as an unweighted one by encoding each input label and cost
       pair of an arc as a new label in the output FSM.  With the -cl  options
       together,  weighted transducers can be represented as unweighted accep-
       tors.  In each case, the mapping from each input label and output label
       and/or  cost  to  its encoding is written as an FSM to the key file. If
       the key file already exists, fsmencode will  first  use  the  encodings
       specified  in  the  key  fsm for any matching arc labelings.  fsmencode
       with the -d option, decodes an encoded FSM if the same set of -l and -c
       options and key FSM is given as used in its encoding.

       fsmquantize  replaces  each  cost in an FSM with the nearest element in
       its delta-quantization (see fsmcost(3)).

       fsmequiv exits with zero status if fsm1 and fsm2 are  equivalent.  With
       the -v option, a text message is printed if the machines are not equiv-
       alent. The inputs must be deterministic, epsilon-free acceptors.

   FSM EPSILON OPERATIONS
       fsmrmepsilon returns an FSM equivalent to the input FSM that is epsilon
       removed.  For an acceptor, this means there are no epsilon transitions.
       For a transducer, this means there are no paired  input-output  epsilon
       transitions.

       fsmsynchronize  returns an FSM equivalent to the input FSM in which the
       input and output labels are synchronized.  For a transducer, this means
       as  any  path  is  traversed,  its  delay  is  either zero or increases
       strictly monotonically -- the delay of a path is the difference between
       the  number  of  non-epsilon  labels encountered on the output side and
       those encountered on the input side.  The input FSM must  have  bounded
       delays,  that is the delay of any cycle must be zero. All acceptors are
       already synchronized. The worst case time and space complexity  of  the
       algorithm  is  O(O((|Q| + |E|)(|A|^|d| + |B|^|d|))), where Q is the set
       of states of the input FSM, E its transition set, A its input alphabet,
       and B its output alphabet and |d| the maximum delay (in absolute value)
       of a path of the input FSM.

       fsmepsnormalize returns an equivalent FSM that is  epsilon  normalized.
       For  an  acceptor,  this is the same as being epsilon removed (see fsm-
       rmepsilon in fsm(1)). For a transducer, it is epsilon removed  and  has
       the  following  additional property.  With the -i or -1 flag, a transi-
       tion with an epsilon input label is never followed on a path by a tran-
       sition  with  an  non-epsilon input label.  With the -o or -2 flag, the
       corresponding property is true for the output labels.   The  input  FSM
       must be epsilon-normalizable.

   FSM SEARCH
       fsmbestpath  returns  the  lowest-cost path from the start state of the
       input FSM to a final state. The path is encoded as a (single path) FSM.
       With  the .I -n nbest option, the nbest lowest-cost paths are returned.
       and a particular semiring.  The -f option specifies the FSM representa-
       tion  class (see fsmclass(3)); by default, it is the same as the input.
       The -s option specifies the cost semiring (see fsmcost(3)); by default,
       it  is  the  same  as  the input.  Note the costs are not modified when
       changing semirings, they are simply reinterpreted in the new  semiring.

       fsmarcsort  sorts  the  arcs  of  the input FSM according to the option
       given: -i sorts by the arc input label, -o  sorts  by  the  arc  output
       label, and -c sorts by the arc cost. All sort from least to greatest.

       fsmtopsort sorts the input FSM according to the option given: -t does a
       full topological sort; -e topologically sorts  with  respect  to  (i/o)
       epsilons,  and  -i  topologically  sorts  with  respect  to  the  input
       epsilons.  In each case, if there is a cycle with respect to the  sort-
       ing criterion, fsmtopsort returns the input FSM unsorted.

       fsminfo  sends  to  standard output the following information about the
       input FSM -- its  FSM  representation  class,  its  cost  semiring  and
       whether  it  is an acceptor or transducer.  With the -n option, various
       numeric information is printed, including the number of states,  number
       of  transitions,  final states, epsilon transitions, strongly-connected
       components, accessible states, and co-accessible states.  With  the  -p
       option,  FSMProps (see fsmaccess(3)) is called on the input, which will
       return pre-computed information about the FSM, such as  whether  it  is
       cyclic,  costless,  non-negative,  or  deterministic.  If  pre-computed
       information about a property is not supported by the FSM class, a ''?''
       is  printed  for  it. With the -t option, values for all FSM properties
       are printed (by explicit tests run on the FSM  if  needed).  See  "fsm-
       props.h"  for  the  set of defined FSM properties.  With the -c option,
       the FSM class properties are printed, which include the FSM  operations
       supported  by  that class.  See "fsmprops.h" for the set of defined FSM
       class properties.  With the -q w  option,  quantiles  in  intervals  of
       width  w  are printed for various data including state in-degree, state
       out-degree, input label, output label, and arc cost. With the  -b  q1''
       option,  the quantiles begin at q1 (default: 0.0), and with the -e q2''
       option, the quantiles end at p2  (default:  1.0).   The  -v  option  is
       equivalent  to -tcn -q4.  With the -P flag, information is sent instead
       to standard error and a copy of the input is sent to standard output.

CAVEATS
       To allow limiting the size of their results,  fsmintersect,  fsmdiffer-
       ence,   fsmcompose  fsmrmepsilon,  fsmsynchronize  fsmepsnormalize  and
       fsmdeterminize each accept the -c cthresh option and -n nthresh,  which
       call FSMPrune (see fsm(3)) prior to output. For fsmrmepsilon, fsminter-
       sect, fsmdifference, and fsmcompose, if the -c  or  -n  option  is  not
       given,  then FSMConnect (see fsm(3)) is instead called on the output to
       remove any states that are not on a path from  the  start  state  to  a
       final state. However, if the -d option is given, FSMConnect will not be
       called. In any case, memory size is, by default, kept to a  mininum  by
       using  a  two-pass  expansion strategy.  With the '-1' option, only one
       expansion pass is used, which can halve the execution time  in  certain
       circumstances, but increase memory usage.

       Some  commands  take  additional  options  not  specified  here. The -?
       option describes the complete set of options available  for  each  com-
       mand.

BUGS
       On non-POSIX compliant systems, binary data directed to standard output
       may become corrupted. A command-line argument of -F can instead be used
       to redirect the output FSM to a specified file.

SEE ALSO
       fsmintro(1)                        Intro.   to  the  FSM  programs  and
                                          library.
       fsm(3)                             FSM C library.
       fsmaccess(3)                       FSM C accessors.
       fsmcost(3)                         FSM cost definitions.
       fsmclass(3)                        FSM class description.
       fsmobject(3)                       FSM object definition.
       fsm(5)                             FSM file formats.
       http://www.research.att.com/sw/tools/fsm
                                          FSM home page -- software,  documen-
                                          tation and references.
FILES
       /n/lvr/linux/bin/fsm-4             Distribution binaries.
       /n/lvr/linux/src/cmd/fsm/fsm-4     Distribution sources.
AUTHORS
       Cyril Allauzen (allauzen@research.att.com)
       Mehryar Mohri (mohri@research.att.com)
       Fernando Pereira (pereira@cis.upenn.edu)
       Michael Riley (riley@research.att.com)
       Copyright (C) 1998-2003 AT&T Corp. All rights reserved.



Version 4.0                                                             FSM(1)