OpenFst Forum 2016 Archive

KyleGorman - 2016-10-11 - 18:52

This returns a pointer, not a reference, now, simply for cross-API consistency (other functions of the form "GetFst" all return a pointer as well). You're welcome to upgrade to any version >= 1.5.3 to get a version that returns a pointer. I can't think of a reasonable way to write code that can handle both, sorry.

EstherJudd - 2016-10-18 - 04:43

Ok, no problem. I will tell our software coders to update to the latest version of openfst.

Log In

Showing characters in the input label in PyFST

JoseOrtiz - 2016-09-28 - 19:41

I'm using the method linear_chain to accept a String. When I convert it into a fst binary to then into a DOT format, I get integers instead of the characters. Also, I have a SymbolTable for each of the corresponding letters being read.

What I need is to show the characters instead, be by the command line or by coding directly into Python. Sorry if this is not the place to ask. Any help or reference would be greatly appreciated.

EstherJudd - 2016-09-29 - 03:51

I have done it on the command line by typing:

fstdraw --isymbols=isyms.txt --osymbols=osyms.txt binary.fst

I haven't been using this in python but maybe you can just do this?:


KyleGorman - 2016-10-11 - 18:55

The OpenFst developers don't contribute to pyfst, so you might be better off asking at their GitHub site:

(Though note that project appears to have been abandoned: the last commit was 3 years ago, and it's unlikely to compile against any reasonable recent version of OpenFst).

JoseOrtiz - 2016-10-18 - 17:45

Thank you both. The suggestion by Esther worked perfectly. I started using PyFST instead of pywrapfst because I was unable to install it correctly in my laptop, however, I still wish to install it. Is the pywrapfst still being maintained?

Once again, thank you for taking the time to answer.

KyleGorman - 2017-01-23 - 17:24

pywrapfst is very much maintained; we use it every day. If you're having installation problems, does it compile?

Log In

+= and *= equivalence functions for weights

MaartenVanCasteren - 2016-08-08 - 11:17

Would it be possible to add functions to the weight templates that work equivalent to += and *= in C++, to avoid unnecessary copying? They could be .Times(const weight& w) and .Plus(const weight& w). The normal Times() and Plus() could then potentially be implemented using these.

KyleGorman - 2016-10-11 - 19:07

That certainly could be done. (Nit: you actually need separate LeftTimes and RightTimes methods, since some semirings are non-commutative as mentioned in I'm not sure this is going to be worth much for the float weight types since everything ought to fit in CPU registers, and for composite weight types, I wonder if there's copy elision here. Do you know for a fact there's not? (I.e., what happens if you compile with -fno-elide-constructors on your system?)

Log In

Label with String

JoseOrtiz - 2016-07-15 - 12:23


I am currently trying to set the Labels to more than one character per Label. Would this be possible?

EstherJudd - 2016-08-23 - 08:33

I have a SymbolTable where the symbols sometimes consist of two characters, so this shouldn't be a problem. I would advise you to work with a SymbolTable that maps each label to a number and train the FST with the numbers instead of the labels.

Log In

pywrapfst SymbolTable allow_negative_labels flag?

LouisaConwill - 2016-07-11 - 15:52


I am writing a Python application to traverse FSTs. I want to read in a binary FST (which does not have a symbol table associated with it), read in a symbol table, and set it as the binary FST's symbol table. The code that I've written looks like this:

fst = isyms = SymbolTable.read_text(args.sym_table) fst.set_input_symbols(isyms)

This code works great when the symbol table does not have negative labels; however, it does not work when the symbol table has negative labels. The error I get is: "ERROR: SymbolTable::ReadText: Bad non-negative integer "-4"" (the symbol table I'm trying to read in has -4 as one if its labels).

Is there a way to set allow_negative_labels to true using either or SymbolTable.read_text? I know this is possible in C++ and when using the openfst command line tool, but I'm struggling to figure out how to do this in Python.

Thank you!

KyleGorman - 2016-10-11 - 19:16

There is a flag --allow_negative_labels for the binaries, but the ability to use them is not exposed to pywrapfst yet. I can do so, but I really wouldn't recommend using negative labels if you can possibly avoid it. I'll update the thread once/if I add this support.

KyleGorman - 2016-10-13 - 20:37

Okay, I've added the ability to specify that you want to permit negative labels for pywrapfst.SymbolTable.read_text. That'll be in the next release. But l I still strongly discourage using negative labels.

Log In

openfst 1.5.3 causes ld to segfault with gcc 4.7.2

DanielPovey - 2016-06-29 - 15:12

On our grid in JHU, when compiling OpenFst 1.5.3, we get a segfault in "ld". I have never heard of "ld" segfaulting before so something pretty funky must be going on. What compiler versions do you think it should work on?

jtrmal@a14 ~/soft/openfst-1.5.3/src/script $ /bin/sh ../../libtool --verbose --tag=CXX --mode=link g++ -std=c++11 -version-info 4:0:0 -o -rpath /usr/local/lib arciterator-class.lo arcsort.lo closure.lo compile.lo compose.lo concat.lo connect.lo convert.lo decode.lo determinize.lo difference.lo disambiguate.lo draw.lo encode.lo encodemapper-class.lo epsnormalize.lo equal.lo equivalent.lo fst-class.lo info.lo intersect.lo invert.lo isomorphic.lo map.lo minimize.lo print.lo project.lo prune.lo push.lo randequivalent.lo randgen.lo relabel.lo replace.lo reverse.lo reweight.lo rmepsilon.lo script-impl.lo shortest-distance.lo shortest-path.lo stateiterator-class.lo synchronize.lo text-io.lo topsort.lo union.lo weight-class.lo verify.lo ../lib/ -lm -ldl libtool: link: rm -fr .libs/ libtool: link: g++ -fPIC -DPIC -shared -nostdlib /usr/lib/gcc/x86_64-linux-gnu/4.7/../../../x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/4.7/crtbeginS.o .libs/arciterator-class.o .libs/arcsort.o .libs/closure.o .libs/compile.o .libs/compose.o .libs/concat.o .libs/connect.o .libs/convert.o .libs/decode.o .libs/determinize.o .libs/difference.o .libs/disambiguate.o .libs/draw.o .libs/encode.o .libs/encodemapper-class.o .libs/epsnormalize.o .libs/equal.o .libs/equivalent.o .libs/fst-class.o .libs/info.o .libs/intersect.o .libs/invert.o .libs/isomorphic.o .libs/map.o .libs/minimize.o .libs/print.o .libs/project.o .libs/prune.o .libs/push.o .libs/randequivalent.o .libs/randgen.o .libs/relabel.o .libs/replace.o .libs/reverse.o .libs/reweight.o .libs/rmepsilon.o .libs/script-impl.o .libs/shortest-distance.o .libs/shortest-path.o .libs/stateiterator-class.o .libs/synchronize.o .libs/text-io.o .libs/topsort.o .libs/union.o .libs/weight-class.o .libs/verify.o -Wl,-rpath -Wl,/home/jtrmal/soft/openfst-1.5.3/src/lib/.libs ../lib/.libs/ -ldl -L/usr/lib/gcc/x86_64-linux-gnu/4.7 -L/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../x86_64-linux-gnu -L/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../../lib -L/lib/x86_64-linux-gnu -L/lib/../lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib -L/usr/lib/gcc/x86_64-linux-gnu/4.7/../../.. -lstdc++ -lm -lc -lgcc_s /usr/lib/gcc/x86_64-linux-gnu/4.7/crtendS.o /usr/lib/gcc/x86_64-linux-gnu/4.7/../../../x86_64-linux-gnu/crtn.o -Wl,-soname -Wl, -o .libs/ collect2: error: ld terminated with signal 11 [Segmentation fault]

DanielPovey - 2016-06-29 - 19:40

BTW, this is on debian 7.10. Apparently OpenFst 1.5.3 says it should work from gcc version 4.7 which is what we are using.

DanielPovey - 2016-06-29 - 20:25

I have found more information from valgrind and submitted a bug-report in bugzilla, I think it's a linux binutils bug.

DanielPovey - 2016-06-29 - 21:40

I have documented in the last comment on that the bug can be fixed by taking a certain line from a more recent binutils version 2.26 (this fix works for me; I have to compile binutils from source with the fix). So it looks like it's not a bug in OpenFst but a bug in binutils that it just happens to trigger. Our binutils is version 2.22.


Log In

ShortestDistance very slow?

PafnoutiT - 2016-06-21 - 11:42

I have an fst (actually a Kaldi lattice, it's an acyclic graph in the tropical semiring) on which I wanted to use fst::ShortestDistance to the final state, however it's very slow. It takes about 1.5 sec to process my lattice, whereas if I loop myself over the states and arcs to compute the same quantity it happens in 0.03 seconds. What could I have done wrong?

KyleGorman - 2016-10-11 - 19:22

It's possible that automatic queue selection is failing because certain property bits are not yet set. You could try turning up logging to see what queue is selected (see, or if you're comfortable, could simply assert that the FST is (e.g.,) acyclic (and possibly topologically sorted) using SetProperties() before calling ShortestDistance. It is also possible that the explicit queue is less efficient than manual iteration for this simple case, in which case you should feel free to stick with your implementation.

Log In

Shortest distance very slow?

PafnoutiT - 2016-06-21 - 11:41

I have an fst (actually a Kaldi lattice, it's an acyclic graph in the tropical semiring) on which I wanted to use fst::ShortestDistance, however it's very slow. It takes about 1.5 sec to process my lattice, whereas if I loop myself over the states and arcs to compute the same quantity it happens in 0.03 seconds. What could I have done wrong?

Log In

Python extension compilation error

AlexBogatu - 2016-06-18 - 12:50

Hi, I am trying to use the python extension but I have some problems when I compile the source. I have done the following: - download openfst 1.5.3 - ./configure with --enable-far - make & make install - pip install openfst

This last command fails with the following error:

/usr/local/include/fst/script/fst-class.h:277:3: note: fst::script::FstClass::FstClass()

FstClass() : impl_(nullptr) {}


/usr/local/include/fst/script/fst-class.h:277:3: note: candidate expects 0 arguments, 1 provided

At global scope: cc1plus: warning: unrecognized command line option "-Wno-unneeded-internal-declaration" [enabled by default] error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

Am I missing something? Thank you!

MarxerR - 2016-08-10 - 10:54

I get a similar error with clang:

building 'pywrapfst' extension

clang -fno-strict-aliasing -fno-common -dynamic -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c -o build/temp.macosx-10.11-x86_64-2.7/pywrapfst.o -std=c++11 -Wno-unneeded-internal-declaration -Wno-unused-function error: no matching constructor for initialization of 'fst::script::FstClass'

__pyx_v_tfst = new fst::script::FstClass(__pyx_v_self->_reader->GetFstClass());

^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

/usr/local/include/fst/script/fst-class.h:281:3: note: candidate constructor not viable: no known conversion from 'const fst::script::FstClass *' to 'const fst::script::FstClass' for 1st argument; dereference the argument with *

FstClass(const FstClass &other)


/usr/local/include/fst/script/fst-class.h:383:12: note: candidate constructor not viable: no known conversion from 'const fst::script::FstClass *' to 'fst::script::FstClassImplBase *' for 1st argument

explicit FstClass(FstClassImplBase *impl) : impl_(impl) {}


/usr/local/include/fst/script/fst-class.h:279:12: note: candidate template ignored: could not match 'Fst<type-parameter-0-0>' against 'const fst::script::FstClass *'

explicit FstClass(const Fst<Arc> &fst) : impl_(new FstClassImpl<Arc>(fst)) {}


/usr/local/include/fst/script/fst-class.h:276:3: note: candidate constructor not viable: requires 0 arguments, but 1 was provided

FstClass() : impl_(nullptr) {}


1 error generated.

error: command 'clang' failed with exit status 1

KyleGorman - 2016-10-11 - 19:24

There's no reason to call pip. The version up there is not compatible with arbitrary OpenFst versions. If you do --enable-python then make and make install, you have already installed pywrapfst for whatever version of OpenFst you're building.

EltonJohn - 2020-10-07 - 01:59

error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

Am I missing something?

Most of the time these are dependency-issues. You need to install a package called python-dev. This package includes header files, a static library and development tools for building Python modules, extending the Python interpreter or embedding Python in applications. When encountering this error please note before the error it may say you are missing a package or header file you should find those and install them and verify if it works

For Python 2.x use:

$ sudo apt-get install python-dev

If you using python3, try to replace python-dev with python3-dev

For a specific version of Python 3, replace x with the minor version in

$ sudo apt-get install python3.x-dev

Log In

Changing Stateid value.

JoseOrtiz - 2016-06-18 - 12:30

I'm trying to read files in GFA format and then parse into an openfst format. In GFA format, each new line starts with a character that identifies what type of line it is. What I've done so far is if it starts with H, it ignore the rest of the line and go the next one. If it starts with S, it means it's a state, so it proceeds to identify each part of the line which consist of five strings including the S. After this, I add a state, but I want my state to hold the value of one of the strings identified in the line, which is a numerical value. As I understand, when you use an AddState, it return a Stateid which begins at zero, then increments by one. If it helps, I'm currently using a StdVectorFst object for the AddState function. I am reading about the SetState function from the vector_fst_base_impl class, would I be able to use this funcion to first get the StateId I want to change and then assign the new value? If so, would this cause any errors?


DoganCan - 2016-06-19 - 16:40

Hi Jose

In OpenFst, state IDs are integers. They are used for uniquely identifying states. They are not tied to the states they identify. Most OpenFst operations do not preserve state IDs.

Cheers, Dogan

Log In

fst/fstlib.h header not found

PaulPoveda - 2016-05-31 - 10:31

Hello everyone,

I tried to install OpenFst to use it with SphinxTrain and it seem to work but I got stuck with the following error :

checking fst/fstlib.h usability... no checking fst/fstlib.h presence... no checking for fst/fstlib.h... no configure: error: fst/fstlib.h header not found

But the OpenFst binaries are well installed in /usr/local/lib/fst and the header too in /usr/local/include/fst..

Can anyone give me a clue about the issue ?

KyleGorman - 2016-10-11 - 19:26

Assuming that /usr/local/include/fst/fstlib.h exists and is readable for your user, you may need to ensure that that directory is in your include path. At the compiler level you will want to use the -I flag to pass the directory, but I'm not sure how to do this with SphinxTrain. There may be something wrong with your environment (this directory should very much be in your include path) or with the SphinxTrain configuration setup, but this doesn't seem to be an issue with OpenFst per se.

Log In

ArcIterator construction is very slow with delayed composition

RyanPrenger - 2016-05-03 - 16:04

Hi. I've written a decoder loosely based on Kaldi's simple-decoder that is reasonably fast on an FST that was composed offline (lex + language model). When I try using the decoder on the FST created with delayed composition it's around 100 times slower. All the time seems to be spent in the ArcIterator construction steps. The language model is around 500 megs, and the lex is around 15 megs. Would one expect such a large slow down? How do people usually get decoders to perform with a large language model (the composition results in an FST about double the size)?

KyleGorman - 2016-10-11 - 19:34

It could be quite slow to work with a delayed composition, particularly if the offline composition is slow. The former is used to implement the latter, and many tricks which speed up offline composition should make working with a delayed composition faster too. These include use of specific matchers (or modifying the inputs, e.g. with arc-sorting or setting the FST's properties) so that ComposeFst chooses the best matcher), state tables, and composition filters (see, and selecting the best associativities for cascaded compositions. You might just try to make the offline composition fast and then port that solution to the delayed composition-based decoder.

Log In

Speed of weighted FST loading in c++

EstherJudd - 2016-04-12 - 04:28

I have created a large FST for g2p. Using this command I have been able to get some speed savings when loading. I apply this command in a shell script when building the FST. <br> <verbatim>fstconvert --fst_type=const --fst_align=true in.ofst out.ofst</verbatim> <br> I have also been able to reduce the size of my FST by creating a compressed version using: <br> <verbatim>fstcompress --gzip=true out.fst out_compr.fst</verbatim> <br> Now I read in my final FST in c++ but I use the StdVectorFst. I was reading that it is faster to use StdConstFst but when I use that, I get an error saying that that class is not derived from MutableFst. I use the following command to read in the compressed FST: <br> <verbatim> StdVectorFst *decompressed = new StdVectorFst(); Decompress(remap_fst_name,decompressed,true); </verbatim> <br> So my question is, how can I increase the load speed for this large FST?

MichaelRiley - 2016-05-27 - 10:11

fstcompress is for small file footprint; expands on load and is not fast. constfst is a bit faster to load than vectorfst not memory-mapped and much faster memory-mapped (but then is demand paged as used). you can't mutate a constfst.

EstherJudd - 2016-06-14 - 04:11

Thanks, I guess my question is if I convert a VectorFST to a constFST using the fstconvert command, what is the best way to load it faster in c++? I found this website that talks about this conversion and when I type fst-info it does say it is memory-mapped:

And if I did want to make the resulting FST smaller without losing accuracy, what other things can I do besides fstcompress?

KyleGorman - 2016-10-11 - 19:38

One generic method for reducing the size (in terms of arcs and states) is to use the Optimize method in optimize.h (a header shipped with Thrax and Pynini). It merges identically-labeled multi-arcs, removes epsilons, and applies determinization and minimization where they are known to terminate. See the documentation there for the specific details. It does about as well as one can hope to do automatically without knowing how the FST will be used.

Log In

Possible bug in ReplaceFstMatcher?

GautamT - 2016-02-25 - 22:59

ReplaceFstMatcher in replace.h instantiates MultiEpsMatcher for each of the component fst with mode kMultiEpsList and non terminal labels as possible epsilon. It does this for both MATCH_INPUT and MATCH_OUTPUT case. Under the condition ofMATCH_INPUT, this seems wrong to me. Let's look at the 2 possible scenarios under MATCH_INPUT.

1. epsilon on replace is false: Under this condition input labels continue to remain and do not become epsilons. Input symbols could represent a domain different from output symbols and therefore input labels with values matching those of the non terminals shouldn't be considered epsilons. Replace operation takes place based on the output label and not input label. Current behaviour would match self loop (kNoLabel) against these. Dropping this behaviour would also result in performance gain since one would not have to do a Find on the underlying machine for each of the non terminal label to match against self loop.

2. epsilon on replace is true: Under this condition, input label accompanying a non terminal output label is always re-written to epsilon irrespective of whether or not its value is also that of a non terminal. Matching self loop against input in this case is tricky because one would need to match against corresponding output label. To begin input labels are sorted to use this matcher, to match against output labels that are not sorted does not seem to be a good idea.

My thoughts are that for MATCH_INPUT if epsilon on replace is false then we don't need to use MultiEpsMatcher for each underlying machine, or not perform AddMutlipleEpsLabel. If epsilon on replace is true then not use this matcher and the user wouldn't be able to take advantage of the option to not always cache. The other possible option would be to mandate that if epsilon on replace is true then input label match that of output label for non terminal arc and check this on detecting a non terminal label in the output arc.

To confirm to myself that this is a bug, I constructed a root fst and a sub fst and created a replace fst that is used for a composition with another fst. The replace fst is the rhs fst or fst2 so that it is matched on the input side. The fsts are created in such a way as to expose this problem. The result was that if one were to not do a lazy replace, i.e. fully expand it and then use it for composition, one would get a different result from using a lazy replace or replace fst. For reproducibility, here are the fsts:

root fst 0 1 1 5 1 sub fst 0 1 5 7 1

fst1 or lhs fst for composition 0 1 1 1 1 expanded fst (fstreplace root 10 sub 5 expanded) using 5 as the non terminal to refer to sub fst and root label being a non existent label (non recursive) 0 1 1 0 1 2 5 7 2 3 0 0 3

As can be seen the composition fst1 o expanded fst would result in an empty fst. However when not expanding and doing the composition it would result in a non empty fst with a final state since ReplaceFstMatcher would treat the input label 5 in the replace fst from the sub fst as epsilon and therefore composition proceeds.

Am I missing something?

GautamT - 2016-02-25 - 23:01

Somehow the fst formatting was off, fixing it.

root fst

0 1 1 5

1 sub fst 0 1 5 7 1

fst1 or lhs fst for composition

0 1 1 1


expanded fst (fstreplace root 10 sub 5 expanded) using 5 as the non terminal to refer to sub fst and root label being a non existent label (non recursive)

0 1 1 0

1 2 5 7

2 3 0 0


Log In

G.fst in ngram format work fail on arm

JerryLin - 2016-02-20 - 09:13

Hi all, I built G.fst in ngram format on x86 , and run on x86. it works fine I built G.fst in ngram format on arm , and run on arm, it throws bad_malloc exception. if you built G.fst in vector format on arm, and run on arm, it works fine. How to solve it ? Thank you.

MichaelRiley - 2016-05-27 - 10:13

Possibly changed endianity when crossing platforms. Fix - don't do that.

KyleGorman - 2016-10-11 - 19:40

Right, because IO is memory-mapped, don't expect serialized FSTs to work on a platform with a different endianity. It's a small price to pay for faster IO.

Log In

How to compute edit distance of a given text line

SopheaPRUM - 2016-02-18 - 20:50


Given a text line containing mistakes (Example "AL WUYS LO W PRICES", the correct text line should be "ALWAYS LOW PRICES"). Considering that i have a lexicon and language model. How can i compute the edit distance of this given text line?

Please help Best regards

KyleGorman - 2016-10-11 - 19:43

The last example here ( involves making an edit transducer and then computing the edit distance.

The paper on Pynini ( has an example of using a noisy channel model and a language model to resolve this kind of ambiguity, as well.

Log In

TypeError: unsupported operand type(s) for -: 'int' and 'RestrictedProperty'

MariamAAMER - 2016-02-04 - 13:01

Hi, I am trying to create a script. The part that doesn't work is the pasted below: class MyGrating(Structure): wg_W = PositiveNumberProperty(required = True) wg_L = PositiveNumberProperty(required = True) taper_L = PositiveNumberProperty(required = True) wg_W_Grating = PositiveNumberProperty(required = True) period_Grating = PositiveNumberProperty(required = True) DC_Grating = PositiveNumberProperty(required = True) n_periods_Grating = PositiveNumberProperty(required = True) width_Grating = PositiveNumberProperty(required = True) Out_wg = PositiveNumberProperty(default=1)# 1 si queremos que dibuje guia, 0 si no la queremos def define_elements(self, elems): inside the define_elements, i have a for loop, and inside it, i have todo make an arithmetic operation between two of the above defined properties. I get an error message that says: TypeError: unsupported operand type(s) for -: 'int' and 'RestrictedProperty' Somebody could help me to resolve it. Thanks to everybody

Log In

Linking problem on 64bit arch

DodoIvanecky - 2016-01-08 - 09:54

I am trying to create a simple application with openFST. While fst.Write is working for my simple graph, functions like RmEpsilon or Determinize are not possible to link. In the first case I am getting: undefined reference to `fst::ReverseProperties(unsigned long, bool)' and it is O.K. since: uint64 ReverseProperties(uint64 inprops, bool has_superinitial); But why/from where is fst::ReverseProperties(unsigned long, bool) called, I have no idea. Any hint is appreciated.

DodoIvanecky - 2016-01-08 - 11:22

Looks like I see the same problem with several Property function. Calling Determinize I got undefined reference to `fst::DeterminizeProperties(unsigned long, bool, bool)'. Which is O.K. because: uint64 DeterminizeProperties(uint64 inprops, bool has_subsequential_label, bool distinct_psubsequential_labels)

DodoIvanecky - 2016-01-08 - 12:33

Solved. Problem is compiler. Needs to be at least 4.8. The reported problem was with 4.6 version.

Log In

-- Michael Riley - 2022-11-11

Edit | Attach | Watch | Print version | History: r1 | Backlinks | Raw View | WYSIWYG | More topic actions
Topic revision: r1 - 2022-11-12 - MichaelRiley
This site is powered by the TWiki collaboration platform Powered by PerlCopyright © 2008-2023 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback