(2012-03-08, MichaelRiley)
---+ NGramShrink

---++ Description

This operation _shrinks_ or prunes an n-gram language model in one of three ways:

   * _count pruning_: prunes based on count cutoffs for the various n-gram orders, as specified by =count_pattern=.
   * _relative entropy_: prunes based on a relative entropy criterion controlled by =theta=.
   * _Seymore_: prunes based on the Seymore-Rosenfeld criterion controlled by =theta=.

The C++ classes are all derived from the base class =NGramShrink=.

---++ Usage

|<verbatim>
ngramshrink [--opts] [in.mod [out.mod]]
  --method: type = string, one of: count_prune (default) | relative_entropy | seymore
  --count_pattern: type = string, default = ""
  --theta: type = double, default = 0.0
</verbatim> | |
|<verbatim>
class NGramCountPrune(StdMutableFst *model, string count_pattern);
</verbatim>| |
|<verbatim>
class NGramRelativeEntropy(StdMutableFst *model, double theta);
</verbatim>| |
|<verbatim>
class NGramSeymoreShrink(StdMutableFst *model, double theta);
</verbatim>| |

---++ Examples

<verbatim>
ngramshrink --method=relative_entropy --theta=1.0e-7 in.mod >out.mod
</verbatim>

---

<verbatim>
// Read the input model, prune it by relative entropy, and write the result.
StdMutableFst *model = StdMutableFst::Read("in.mod", true);
NGramRelativeEntropy ngram(model, 1.0e-7);
ngram.ShrinkModel();
ngram.GetFst().Write("out.mod");
</verbatim>

---++ Caveats

For relative entropy or Seymore shrinking, the input n-gram model must be weight-normalized (the probabilities at each state must sum to 1). For count pruning, either a normalized model or raw, unnormalized counts can be used.

---++ References

K. Seymore and R. Rosenfeld. "Scalable Backoff Language Models", _Proc. of International Conference on Spoken Language Processing_. 1996.

A. Stolcke. "Entropy-based Pruning of Backoff Language Models", _Proc. of DARPA Broadcast News Transcription and Understanding Workshop_. 1998.
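
As a note on the normalization requirement in the Caveats, the following is a minimal, self-contained sketch of what "the probabilities at each state must sum to 1" means numerically. It does not use the NGram library: the helper names =StateProbabilityMass= and =IsStateNormalized= are hypothetical, and the sketch assumes a toy state whose outgoing weights are all stored as negative natural-log probabilities, with no backoff arc to account for.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Toy stand-in for one model state: the weights on its outgoing arcs,
// assumed here to be negative natural-log probabilities.
// Hypothetical helper, not part of the NGram library.
double StateProbabilityMass(const std::vector<double> &neg_log_probs) {
  double mass = 0.0;
  for (double w : neg_log_probs) {
    mass += std::exp(-w);  // convert -log(p) back to p and accumulate
  }
  return mass;
}

// Returns true if the state's probabilities sum to 1 within `tolerance`.
// Hypothetical helper; a real model would also have to account for
// backoff arcs, which this sketch deliberately ignores.
bool IsStateNormalized(const std::vector<double> &neg_log_probs,
                       double tolerance = 1e-6) {
  assert(tolerance >= 0.0);
  return std::fabs(StateProbabilityMass(neg_log_probs) - 1.0) <= tolerance;
}
```

Under these assumptions, a state with probabilities 0.5, 0.25 and 0.25 passes the check, while a state holding raw counts such as 2 and 1 does not. Raw counts of that kind are acceptable input for count pruning only.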
Topic revision: r7 - 2012-03-08 - MichaelRiley
Copyright © 2008-2024 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.