SYNOPSIS
#include "dsearch.h"
typedef float Cost;
class DSearch {
public:
typedef enum { IGNORE, PREFER, REQUIRE } FinalStateMode;
DSearch(DSearchModel *model, Fsm netfsm, DSearchParams &params);
~DSearch();
void reset(DSearchModel *model, Fsm netfsm,
           DSearchParams &params); // reset with parameter update
int nextFrame(); // advance search module to next frame
void getResponse(DSearchResponse *res,
                 DSearchParams::ResponseType typ,
                 FinalStateMode final_state_mode = PREFER);
Fsm cutPoint();
DSearchParams::ResponseType responseType();
DSearchParams::SegmentLevel segmentLevel();
DSearchParams::ModelLevel modelLevel();
};
class DSearchParams {
public:
typedef enum { ONEBEST, LATTICE } ResponseType;
typedef enum { ILABEL, OLABEL } SegmentLevel;
typedef enum { STATE, HMM } ModelLevel;
DResponseType response_type; // (def: ONEBEST)
DSegmentLevel segment_level; // (def: OLABEL)
DModelLevel model_level; // (def: HMM)
Cost beam; // state cost pruning threshold
Cost lattice_beam; // lattice path pruning threshold (def: beam)
Cost arcs_max; // arc count pruning threshold (def: INT_MAX)
bool full_lattice; // no approximation in lattice generation (def: false)
int gc_period; // # of frames btwn garbage collection (def: 50)
vector<int> suppress_labels; // output labels that are mapped to epsilon (def: none)
bool cut_label; // cutpoint only when olabel of incoming lattice arcs match this
bool self_loop; // implicit self-loop on each HMM state (def: true)
bool norm_costs; // normalize path costs by best path to current frame
bool eps_prune; // prune epsilon arcs (def: true)
bool gram_prune; // prune w/ grammar only (def: true)
int pin_bounds; // constrain response FSM state times (def: INT_MAX)
};
drecog(1). They depend on the acoustic model interface specified in
dmodel(3).
DSearch is the search module class. Its constructor takes a model, a
netfsm, and a DSearchParams object as its arguments. The model is
described in dmodel(3). The netfsm is the recognition transducer
(''network''). Its input labels are acoustic model indices, its output
labels are typically words, and its costs combine any grammar costs,
pronunciation costs, and insertion penalties, each interpreted as a
negative log probability. Thus, unlike with the drecog(1) parameter
fsms, the user must construct the full recognition transducer
themselves. For a further restriction, see the Caveats section below.

The reset member function, which should be called between utterances,
takes the same types of arguments as the constructor. If the values
differ from those given at construction, the search module is updated
accordingly.

The nextFrame member function advances the search module to the next
frame. Note that this module's interface does not pass or otherwise
constrain the ''acoustic features''. Their structure and semantics are
entirely determined by the acoustic model interface (see dmodel(3)).

The getResponse member function returns recognition hypotheses in the
user-provided DSearchResponse object pointed to by its first argument.
Its second argument determines the response type for this call; it need
not match the response type requested in the DSearchParams
initialization. Its final argument indicates how to treat paths that do
and do not reach the recognition transducer's final states; see the
corresponding parameter in drecog(1) for the full explanation. This
member function may be called at any time during recognition. A partial
hypothesis, i.e., a response requested before the end of the utterance
is reached, can be useful, for example, for ''barge-in'' or for
interactive display. Note that a partial hypothesis may be modified
(and not just extended) as further processing occurs.

The cutPoint member function truncates the internal lattice at a
lattice state called the cut point and returns the truncated portion as
an FSM. A cut point is a non-initial state through which all lattice
paths must pass. If the DSearchParams::cut_label parameter was
unspecified (i.e., FSMNoLabel) on initialization, then the cut point is
the state furthest in time that has this path property. If cut_label
was specified, then the cut point is the earliest lattice state that
has this path property and for which the output labels of all incoming
arcs match cut_label. The FSM returned is valid until the next call to
reset, getResponse, cutPoint, or ~DSearch. If no appropriate cut point
can be found, cutPoint returns 0.

Currently, the cutPoint member function requires that the response type
requested in the DSearchParams initialization be ONEBEST.
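
The cut point rule can be illustrated on a toy lattice. The following
is only a sketch and not the library's implementation: Arc, Lattice,
kNoLabel (standing in for FSMNoLabel), findCutPoint, and exampleLattice
are hypothetical names, and the lattice is assumed to be acyclic with
states numbered so that every arc goes from a lower to a higher state.
A state lies on every path exactly when the number of paths entering it
times the number of paths leaving it equals the total number of paths.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kNoLabel = -1;  // hypothetical stand-in for FSMNoLabel

struct Arc { int src, dst, olabel; };

// Toy lattice: state 0 is initial, and every arc goes from a lower to a
// higher state number (an assumption of this sketch).
struct Lattice {
  std::vector<int> time;  // time[s] = frame index of state s
  std::vector<Arc> arcs;
};

// Returns the cut point state under the rule described above, or -1.
int findCutPoint(const Lattice &lat, int cut_label) {
  const int n = static_cast<int>(lat.time.size());
  if (n == 0) return -1;
  std::vector<std::vector<Arc> > in(n), out(n);
  for (std::size_t i = 0; i < lat.arcs.size(); ++i) {
    const Arc &a = lat.arcs[i];
    assert(0 <= a.src && a.src < a.dst && a.dst < n);
    out[a.src].push_back(a);
    in[a.dst].push_back(a);
  }
  // fwd[s]: paths from state 0 to s.  bwd[s]: paths from s to any sink.
  std::vector<long long> fwd(n, 0), bwd(n, 0);
  fwd[0] = 1;
  for (int s = 0; s < n; ++s)
    for (std::size_t i = 0; i < out[s].size(); ++i)
      fwd[out[s][i].dst] += fwd[s];
  for (int s = n - 1; s >= 0; --s) {
    if (out[s].empty()) bwd[s] = 1;
    for (std::size_t i = 0; i < out[s].size(); ++i)
      bwd[s] += bwd[out[s][i].dst];
  }
  const long long total = bwd[0];  // number of complete paths
  int best = -1;
  for (int s = 1; s < n; ++s) {  // start at 1: the initial state is excluded
    if (fwd[s] * bwd[s] != total) continue;  // some path bypasses s
    if (cut_label == kNoLabel) {
      // No label given: keep the qualifying state furthest in time.
      if (best < 0 || lat.time[s] >= lat.time[best]) best = s;
    } else {
      // Label given: keep the earliest state whose incoming arcs all match.
      bool all_match = true;
      for (std::size_t i = 0; i < in[s].size(); ++i)
        if (in[s][i].olabel != cut_label) all_match = false;
      if (all_match && (best < 0 || lat.time[s] < lat.time[best])) best = s;
    }
  }
  return best;
}

// Four paths; all of them pass through states 1 and 4.
Lattice exampleLattice() {
  Lattice lat;
  lat.time = {0, 1, 2, 2, 3, 4, 4};
  lat.arcs = {{0, 1, 5}, {1, 2, 7}, {1, 3, 8}, {2, 4, 9},
              {3, 4, 9}, {4, 5, 3}, {4, 6, 4}};
  return lat;
}
```

In this example states 1 and 4 both have the path property. Without a
cut label the later state 4 is chosen; with cut label 5 the earlier
state 1 is chosen, since its only incoming arc carries that label.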
DSearchParams is a parameter class used to initialize and reset the
search module. Its parameters have exact analogues in the documented
parameters of drecog(1) passed via its first argument, so they are not
documented further here.
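
The order of the calls described above (construct, advance frame by
frame, request partial and final responses, reset between utterances)
can be sketched as follows. Because dsearch.h is not available here,
the sketch uses hypothetical stand-in types in a namespace named
sketch. Only the member names and argument order follow the SYNOPSIS.
Every function body is an assumption, as is the meaning given to the
return value of nextFrame, which this page does not document.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins so that the sketch compiles without dsearch.h.
// Member names and argument order follow the SYNOPSIS; every body, and
// the meaning given to nextFrame()'s return value, is an assumption.
namespace sketch {

struct DSearchModel { int frames_left; };  // stand-in acoustic model
typedef int Fsm;                           // stand-in FSM handle
struct DSearchResponse { std::string text; };

struct DSearchParams {
  enum ResponseType { ONEBEST, LATTICE };
  ResponseType response_type;
  DSearchParams() : response_type(ONEBEST) {}
};

class DSearch {
 public:
  enum FinalStateMode { IGNORE, PREFER, REQUIRE };
  DSearch(DSearchModel *model, Fsm netfsm, DSearchParams &params) {
    reset(model, netfsm, params);
  }
  void reset(DSearchModel *model, Fsm netfsm, DSearchParams &params) {
    assert(model != 0);
    model_ = model;
    netfsm_ = netfsm;
    type_ = params.response_type;
    frame_ = 0;
  }
  // Assumed convention: 1 if a frame was consumed, 0 once input is exhausted.
  int nextFrame() {
    if (model_->frames_left <= 0) return 0;
    --model_->frames_left;
    ++frame_;
    return 1;
  }
  void getResponse(DSearchResponse *res, DSearchParams::ResponseType typ,
                   FinalStateMode final_state_mode = PREFER) {
    (void)typ;               // the stub ignores the response type
    (void)final_state_mode;  // and the final state mode
    res->text = "hypothesis after " + std::to_string(frame_) + " frames";
  }

 private:
  DSearchModel *model_;
  Fsm netfsm_;
  DSearchParams::ResponseType type_;
  int frame_;
};

}  // namespace sketch

// Runs one utterance, requesting a partial hypothesis every `stride`
// frames and a final one once no frames remain.
std::vector<std::string> recognize(sketch::DSearch &search, int stride) {
  using namespace sketch;
  std::vector<std::string> out;
  DSearchResponse res;
  for (int frame = 1; search.nextFrame(); ++frame) {
    if (frame % stride == 0) {  // partial hypothesis, e.g. for display
      search.getResponse(&res, DSearchParams::ONEBEST);
      out.push_back(res.text);
    }
  }
  search.getResponse(&res, DSearchParams::ONEBEST, DSearch::REQUIRE);
  out.push_back(res.text);
  return out;
}
```

A caller would construct one DSearch object, run recognize for the
first utterance, and then call reset with the model, network, and
parameters for the next utterance before running it again.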
parameter). In particular, this transducer must have grouped input
labels (see fsmclass(3)), i.e., all transitions with the same input
label must be in a contiguous block and the block with the zero label
must be first. The FSM classes FSMInputGroupedClass and
FSMInputIndexedClass are two classes that satisfy this property (see
fsmclass(3)). It is important that the user ensure the netfsm satisfies
this property since, for efficiency reasons, it is not checked by the
search module.
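
Since the search module does not check this property, a caller may want
to verify it before constructing the search module. The following is a
minimal sketch, assuming that the property applies to the list of
transitions leaving each state and that the input labels of one such
list have already been copied into a vector. The function name
hasGroupedInputLabels is hypothetical, and the FSM library calls needed
to extract the labels are not shown.

```cpp
#include <cassert>  // for the caller-side assert mentioned below
#include <cstddef>
#include <set>
#include <vector>

// Returns true if equal input labels form contiguous blocks and the
// block with label zero, if present, comes first.
bool hasGroupedInputLabels(const std::vector<int> &ilabels) {
  std::set<int> closed;  // labels whose block has already ended
  for (std::size_t i = 0; i < ilabels.size(); ++i) {
    const int label = ilabels[i];
    if (i > 0 && label != ilabels[i - 1]) {
      if (label == 0) return false;  // a zero block after another block
      closed.insert(ilabels[i - 1]);
    }
    if (closed.count(label)) return false;  // this label's block restarted
  }
  return true;
}
```

A debug build might guard construction with a statement such as
assert(hasGroupedInputLabels(labels)) for each state's label list.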
DIAGNOSTICS
When an error occurs in the DCD library, a diagnostic message is printed
on standard error and then exit(1) is called.
SEE ALSO
drecog(1) Transducer-based speech recognizer command.
drecog(3) Transducer-based speech recognizer C++ class.
dmodel(3) DCD library acoustic model interface.
dutils(1) DCD utility user programs.
dutils(3) DCD utility C++ routines.
fsmintro(1) Introduction to the FSM finite-state
machine library.
fsm(1) FSM user commands.
fsmclass(3) FSM class description.
FILES
/n/lvr/linux/bin/dcd-2 Distribution binaries.
/n/lvr/linux/src/cmd/dcd/dcd-2 Distribution sources.
/n/lvr/linux/include/dcd-2 Distribution DCD include files.
/n/lvr/linux/lib/libdcd-2.a Distribution DCD library.
AUTHORS
Michael Riley (riley@research.att.com)
Copyright (C) 2003 AT&T Corp. All rights reserved.
Version 2.0 DSEARCH(3)