Determinize

Description

This operation determinizes a weighted transducer. The result will be an equivalent FST that has the property that no state has two transitions with the same input label. For this algorithm, epsilon transitions are treated as regular symbols (cf. RmEpsilon).

The transducer must be functional. The weights must be (weakly) left divisible (valid for TropicalWeight and LogWeight for instance) and zero-sum-free.

Usage

template <class Arc>
void Determinize(const Fst<Arc> &ifst, MutableFst<Arc> *ofst);
doc
template <class Arc> DeterminizeFst<Arc>::
DeterminizeFst(const Fst<Arc> &fst);
doc
fstdeterminize a.fst out.fst
 

Examples

A:

determinize1.jpg

(TropicalWeight)

Determinize of A:

determinize2.jpg

Determinize(&A);
DeterminizeFst<Arc>(A);
fstdeterminize a.fst out.fst

Complexity

Determinize:
  • Determinizable: exponential (polynomial in the size of the output)
  • Non-determinizable: does not terminate
DeterminizeFst:
  • Determinizable: exponential (polynomial in the size of the output)
  • Non-determinizable: does not terminate

The determinizable automata include all unweighted and all acyclic input.

Caveats

Epsilons may be added as input labels at the ends of paths when determinizing transducers. If input transducer also contains epsilons, this may result in a non-deterministic result even when the epsilons are treated as regular symbols. The subsequential label can be chosen as a non-zero value to avoid this issue by passing it as an option

Non-functional transducers are handled by choosing he determinize type option (in a variant call to this function/class):

  • FUNCTIONAL: give an error for non-functional input (default)
  • NONFUNCTIONAL: permissible when the output ambiguity is finite (p-subsequentiable). The subsequential_label should be non-zero and increment_subsequential_label should be true or the result can be non-deterministic by the default use of epsilons as the p-subsequential labels found at the ends of paths. Care should be taken that these p-subsequential labels (subsequential_label, ..., subsequential_label - p - 1) do not collide with existing labels.
  • DISAMBIGUATE: only the shortest path output for each input is retained, permissible when the semiring has the path property

See Also

Disambiguate, RmEpsilon

References

-- MichaelRiley - 20 Jun 2007

Topic attachments
I Attachment History Action Size Date Who Comment
JPEGjpg determinize1.jpg r2 r1 manage 12.5 K 2007-06-21 - 21:43 MichaelRiley  
JPEGjpg determinize2.jpg r2 r1 manage 13.7 K 2007-06-21 - 21:43 MichaelRiley  
Topic revision: r14 - 2015-07-15 - MichaelRiley
 
This site is powered by the TWiki collaboration platform Powered by PerlCopyright © 2008-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback