# OpenFst Forum

You need to be a registered user to participate in the discussions.

You can start a new discussion here:

You can use the formatting commands describes in TextFormattingRules in your comment.
If you want to post some code, surround it with <verbatim> and </verbatim> tags.
Auto-linking of WikiWords is now disabled in comments, so you can type VectorFst and it won't result in a broken link.
You now need to use <br> to force new lines in your comment (unless inside verbatim tags). However, a blank line will automatically create a new paragraph.
 Subject Comment

## weight type conversion in fst

### IkeYuki - 2017-12-10 - 23:08

Is there a method to convert fst's weight? I want to get VectorFst<LogArc> from VectorFst<StdArc> such as HCLG.fst.

### IkeYuki - 2017-12-15 - 07:19

I tried to use Arcmap, but an error occured. <verbatim> fst::MutableFst<fst::LogArc> *fst; fst::ArcMap<fst::LogArc,fst::WeightConvertMapper<fst::StdArc,fst::LogArc>>(*fst,fst::WeightConvertMapper<fst::StdArc,fst::LogArc>()); </verbatim>

<verbatim> No matching function for call to ArcMap(fst::MutableFst<fst::ArcTpl<fst::LogWeightTpl<float> > >&, fst::WeightConvertMapper<fst::ArcTpl<fst::TropicalWeightTpl<float> >, fst::ArcTpl<fst::LogWeightTpl<float> > >)' </verbatim>

Any ideas for this? Thanks for advance.

## Intersect operation gives error on symbol tables mismatch

### SanJoshi - 2017-11-16 - 04:31

I want to check if a string exists in a set of strings. I created two FSTs

The first FST contains the set of all strings as shown here http://www.openfst.org/twiki/bin/view/Forum/FstForum#Making_an_FST_with_C_How_to_defi

The second SearchFST contains a single string (i.e. State1 -> a/x -> State2 -> b/y -> FinalState)

The input and output symbol tables for both FST are the same.

Now I sorted both the FSTs, the first on output labels, the second on input labels

ArcSort(&set_fst, StdOLabelCompare()) ArcSort(&search_fst, StdILabelCompare())

When I call StdIntersectFst output_fst(set_fst, search_fst)

I get this error WARNING: CompatSymbols: Symbol table checksums do not match. Table sizes are 3 and 3 FATAL: ComposeFst: Output symbol table of 1st argument does not match input symbol table of 2nd argument

Slide 28 here http://www.openfst.org/twiki/pub/FST/FstSltTutorial/part1.pdf seems to depict that both FSTs in an Intersect have the same symbol table.

What am I doing wrong ?

## Get raw arc arrays from VectorFst or ArcIterator.

### JustinLuitjens - 2017-11-09 - 16:45

Is there a method to get the raw arc arrays from either an Fst or an ArcIter? I'd like to be able to parallelism across an ArcIterator loop but to do that I need a way to get each arc in constant time.

### KyleGorman - 2017-11-21 - 11:53

There's no guarantee that an FST type will have a precomputed arc array, so I'd just do something like:

std::vector<Arc> arcs; for (ArcIterator<Fst<Arc>> aiter(fst, state); aiter.Done(); aiter.Next()) { arcs.push_back(aiter.Value()); }

## Error in fstconvert in openfst-1.6.5

### EricWang - 2017-11-06 - 07:14

Hi, I have tried to run "./tools/openfst/openfst-1.6.5/bin/fstconvert --fst_type=olabel_lookahead --save_relabel_opairs=tmp.1 tmp.111> tmp.222" I got a error info as follows: ERROR: GenericRegister::GetEntry: lookup failed in shared object: olabel_lookahead-fst.so FATAL: Fst::Convert: Unknown FST type olabel_lookahead (arc type log)

But If I run above command using fstconvert in openfst-1.1.2.10, It is ok! I guess there some wrong in compiling fstconvert in openfst-1.6.5, but I don't know how to fix it. Could anyone give some helps? Thanks!

Eric

### KyleGorman - 2017-11-21 - 11:55

You need to make sure you build lookahead FST SOs during compilation (--enable-lookahead-fsts) and that the generated SOs are in your LD_LIBRARY_PATH, or fstconvert won't be able to find them.

(1.1.2 is truly ancient---can't help you with why this would have worked 9 years ago.)

## python minimize allow_nondet

### DavidVanLeeuwen - 2017-11-03 - 12:05

Hello,

I think there is a small bug in pywrapfst, the parameter allow_nondet seems not to be transferred to the underlying c-call. If openfst's development were organized on github, I'd sent in a pull request.

Here is a diff that I think shows the bug, however, I can't get it working as I don't understand the build process. There is a pywrap.cc file that even has a "CYTHON_UNUSED" or something in a relevant place, but how the file is built is beyond my grasp. Maybe someone can have a look at this, that would be great.

Cheers,

david

* pywrapfst.pyx-orig 2017-11-03 14:43:48.072896578 +0100

pywrapfst.pyx 2017-11-03 14:44:04.829572772 +0100 ************* * 2019,2025 ** Returns: self. """ ! self._minimize(delta) return self

cpdef MutableArcIterator mutable_arcs(self, int64 state):

2019,2025 ---- Returns: self. """ ! self._minimize(delta, allow_nondet) return self

cpdef MutableArcIterator mutable_arcs(self, int64 state):

### KyleGorman - 2017-11-03 - 15:35

Thanks, we'll fix that on our side. (A PR wouldn't be helpful because OpenFst's source is generated programmatically from proprietary code we develop within Google and I can't think of any serious alternative to that workflow that'd work for us.)

pywrapfst.cc is generated by Cython from pywrapfst.pyx, and then can be compiled normally. (We just build straight from the .cc though. Thus building this doesn't require that you have a copy of Cython, unless you want to hack the .pyx first.)

### DavidVanLeeuwen - 2017-11-04 - 07:02

Thanks for the explanation.

I hack-fixed it for my own application now by changing a (bool)0 into a (bool)1 somewhere in the .cc.

I couldn't get an explicit determinize to work, the toy example

0 1 a a 0 2 a a 1 3 b b 2 3 b b 3

gives an empty result using cmdline openfst tools, and similarly for the pywrapfst interface.

Cheers,

david

### KyleGorman - 2017-11-21 - 11:57

is the compiled FST from your toy example well-formed? (what does fstinfo say about it?)

### EditaStewart - 2017-12-04 - 11:42

https://www.echoes-of-war.com

## python3 and configure

### PavelDenisov - 2017-10-28 - 15:22

Running "./configure --enable-python" with python3 produces following error:

<verbatim> checking for a Python interpreter with version >= 2.7... python checking for python... /usr/bin/python checking for python version... 3.5 checking for python platform... linux checking for python script directory... ${prefix}/lib64/python3.5/site-packages checking for python extension module directory...${exec_prefix}/lib64/python3.5/site-packages checking for python3.5... (cached) /usr/bin/python checking for a version of Python >= '2.1.0'... File "<string>", line 1 import sys, string; ver = string.split(sys.version)[0]; print ver >= '2.1.0' ^ SyntaxError: invalid syntax </verbatim>

It looks like python stuff in the configure script is not designed to work with python3. I would like to try to port it. Are there any reasons for not doing this?

### KyleGorman - 2017-11-03 - 15:32

That error is coming from Python invoked by the autoconf macro, looks like. So we can fix that but maybe it's better to fix it upstream there.

Before submitting you may want to test Python 3 with the Python extension. We still develop on Python 2.7, exclusively, unfortunately, and it's not tested on Python 3.

### PavelDenisov - 2017-11-04 - 12:25

OK, thank you. I'll keep that in mind.

### ChristopherKermorvant - 2017-11-07 - 05:38

Hi Pavel, Do you have a version of ./configure working with python3 ? I am also trying to install pywrapfst with python3 Thank you

### PavelDenisov - 2017-11-08 - 18:02

I was able to overcome the configure problem by taking ax_python_devel.m4 from fresh autoconf-archive distribution.

However other things must be taken care of before pywrapfst can be used with python3. At the moment I'm getting such error:

<verbatim>

pywrapfst.cc: In function 'void __Pyx_CppExn2PyErr()': pywrapfst.cc:2448:7: error: exception handling disabled, use -fexceptions to enable throw; ^~~~~ pywrapfst.cc:2450:40: error: 'exn' was not declared in this scope PyErr_SetString(PyExc_MemoryError, exn.what());

</verbatim>

## How recover state mapping after ShortestPath?

### JeffToprak - 2017-09-17 - 12:41

I have the problem that I am extracting the shortest path of an FST via ShortestPath. However, the resulting FST has entirely new state IDs, and I would like have a map between the old state IDs and the new IDs (I have auxiliary information on the origin state IDs I want to carry over). How would I go about that? I guess I could try to "look up" the shortest path by recursively iterating through the original FST, matching ilabel/olabels , but is there a better way?

### MichaelRiley - 2017-09-22 - 15:25

Assuming your FST is an FSA you could:

1. make it an FST whose output label encoded the state/arc ID info OR
2. You can intersect the FST with the input but pass the state table as an option first in IntersectFstOptions. The state table stores the mapping to and from the input state pairs and the output state.

Would be nice if Compose/IntersectFst offered GetStateTable() but issues with getting the right type of the return.

## Pywrapfst API

### AkuanLiu - 2017-09-13 - 06:41

Hi,

I'm having a bit of difficulty obtaining the API documentation for the Python wrapper of Openfst. I would like to use some more advanced methods such as epsilon removal and determinization, but am unsure as to how to call them.

Any help would be greatly appreciated. Thanks!

### KyleGorman - 2017-10-12 - 17:25

Like most Python packages, they're documented extensively within. E.g.:

>>> import pywrapfst

>>> help(pywrapfst.determinize)

determinize(...)

determinize(ifst, delta=0.0009765625, det_type="functional", nstate=NO_STATE_ID, subsequential_label=0, weight=None, incremental_subsequential_label=False)

Constructively determinizes a weighted FST.

This operations creates an equivalent FST that has the property that no state has two transitions with the same input label. For this algorithm, epsilon transitions are treated as regular symbols (cf. rmepsilon).

Args:

ifst
The input FST.
delta
Comparison/quantization delta.
det_type
Type of determinization; one of: "functional" (input transducer is functional), "nonfunctional" (input transducer is not functional) and disambiguate" (input transducer is not functional but only keep the min of ambiguous outputs).
nstate
State number threshold.
subsequential_label
Input label of arc corresponding to residual final output when producing a subsequential transducer.
weight
A Weight or weight string indicating the desired weight threshold below which paths are pruned; if omitted, no paths are pruned.
increment_subsequential_label
Increment subsequential when creating several arcs for the residual final output at a given state.

Returns: An equivalent deterministic FST.

Raises:

FstArgError
Unknown determinization type.

See also: disambiguate, rmepsilon.

## Python installation difficulties with both methods

### PeterChoy - 2017-09-11 - 08:48

Hi all,

I am trying to use the python wrapper for openfst. I have tried both methods below, and cannot get either to work:

1. I configure using the configuration flag 'enable-python', run 'make' and then 'make install'. Running 'import pywrapfst' in Python gives me 'No module named pywrapfst'.

2. I run 'pip install openfst'. Running 'import pywrapfst' in Python gives me: 'ImportError: libfstfarscript.so.8: cannot open shared object file: No such file or directory'

I can't find anyone with the same issue: some help or ideas of how to troubleshoot would be deeply appreciated.

### PeterChoy - 2017-09-11 - 08:58

P.S I am using Python 2.7, trying to install openfst-1.6.1

### KyleGorman - 2017-09-11 - 10:09

If make install with enable-python ran, then there is somewhere a pywrapfst.so on your system. Find it, and make sure it's in your Python path (sys.path from within Python) for whatever Python installation you're using. (Many people have multiple installations of Python on their system.) You can do this by adjusting your Python path or moving pywrapfst.so into the existing path.

Using pip for this doesn't make much sense, and that's out of date anyways.

## how to implement an on the fly fst replacement？

### XingW - 2017-09-07 - 23:39

Dear all， I'm a newbie of openfst，nowadays I want to implement an on the fly fst replacement, could anyone teach me something about how to achieve the on the fly function in openfst? Thanks very much in advance!

### KyleGorman - 2017-09-08 - 11:15

Construct a fst::ReplaceFst with your RTN specification, then pass it downstream to other algorithms which will expand it. Use the cache options struct (and/or command-line flags) to control how caching behaves.

### XingW - 2017-09-08 - 23:07

Dear Kyle, thanks for your kindly reply, could you teach me something about on the fly? I know there is a command line named fstreplace, but I have no idea about how to implement the on the fly, is it means online?

### KyleGorman - 2017-09-11 - 10:06

"on-the-fly" here means something similar to "lazy" in the sense of "lazy evaluation". It computes the FST corresponding to the RTN "as needed" by other operations; optionally, it stores recently visited states in the computation in a in-memory cache of size specified by the user. This is useful when the corresponding FST is large or only a small portion of it will be visited by subsequent operations.

You can't use lazy computation from the command-line, so fstreplace isn't really doing on-the-fly computation in any relevant sense. The only way to take advantage of lazy computation is through the lowest-level C++ API.

## Error compiling OpenFst

### LouNicotra - 2017-09-01 - 10:37

Hi, getting this error while running checks. Trying to install as a requisite for pyfst.

root@tiger21 openfst-1.6.3# ./configure --enable-python checking for a BSD-compatible install... /bin/install -c checking whether build environment is sane... yes checking for a thread-safe mkdir -p... /bin/mkdir -p checking for gawk... gawk checking whether make sets $(MAKE)... yes checking whether make supports nested variables... yes checking for style of include used by make... GNU checking for gcc... gcc checking whether the C compiler works... yes checking for C compiler default output file name... a.out checking for suffix of executables... checking whether we are cross compiling... no checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether gcc accepts -g... yes checking for gcc option to accept ISO C89... none needed checking whether gcc understands -c and -o together... yes checking dependency style of gcc... gcc3 checking for ar... ar checking the archiver (ar) interface... ar checking for g++... g++ checking whether we are using the GNU C++ compiler... yes checking whether g++ accepts -g... yes checking dependency style of g++... gcc3 checking build system type... x86_64-unknown-linux-gnu checking host system type... x86_64-unknown-linux-gnu checking how to print strings... printf checking for a sed that does not truncate output... /bin/sed checking for grep that handles long lines and -e... /bin/grep checking for egrep... /bin/grep -E checking for fgrep... /bin/grep -F checking for ld used by gcc... /bin/ld checking if the linker (/bin/ld) is GNU ld... yes checking for BSD- or MS-compatible name lister (nm)... /bin/nm -B checking the name lister (/bin/nm -B) interface... BSD nm checking whether ln -s works... yes checking the maximum length of command line arguments... 1572864 checking whether the shell understands some XSI constructs... yes checking whether the shell understands "+="... yes checking how to convert x86_64-unknown-linux-gnu file names to x86_64-unknown-linux-gnu format... func_convert_file_noop checking how to convert x86_64-unknown-linux-gnu file names to toolchain format... func_convert_file_noop checking for /bin/ld option to reload object files... -r checking for objdump... objdump checking how to recognize dependent libraries... pass_all checking for dlltool... no checking how to associate runtime and link libraries... printf %s\n checking for archiver @FILE support... @ checking for strip... strip checking for ranlib... ranlib checking command to parse /bin/nm -B output from gcc object... ok checking for sysroot... no checking for mt... no checking if : is a manifest tool... no checking how to run the C preprocessor... gcc -E checking for ANSI C header files... yes checking for sys/types.h... yes checking for sys/stat.h... yes checking for stdlib.h... yes checking for string.h... yes checking for memory.h... yes checking for strings.h... yes checking for inttypes.h... yes checking for stdint.h... yes checking for unistd.h... yes checking for dlfcn.h... yes checking for objdir... .libs checking if gcc supports -fno-rtti -fno-exceptions... no checking for gcc option to produce PIC... -fPIC -DPIC checking if gcc PIC flag -fPIC -DPIC works... yes checking if gcc static flag -static works... no checking if gcc supports -c -o file.o... yes checking if gcc supports -c -o file.o... (cached) yes checking whether the gcc linker (/bin/ld -m elf_x86_64) supports shared libraries... yes checking whether -lc should be explicitly linked in... no checking dynamic linker characteristics... GNU/Linux ld.so checking how to hardcode library paths into programs... immediate checking whether stripping libraries is possible... yes checking if libtool supports shared libraries... yes checking whether to build shared libraries... yes checking whether to build static libraries... no checking how to run the C++ preprocessor... g++ -E checking for ld used by g++... /bin/ld -m elf_x86_64 checking if the linker (/bin/ld -m elf_x86_64) is GNU ld... yes checking whether the g++ linker (/bin/ld -m elf_x86_64) supports shared libraries... yes checking for g++ option to produce PIC... -fPIC -DPIC checking if g++ PIC flag -fPIC -DPIC works... yes checking if g++ static flag -static works... no checking if g++ supports -c -o file.o... yes checking if g++ supports -c -o file.o... (cached) yes checking whether the g++ linker (/bin/ld -m elf_x86_64) supports shared libraries... yes checking dynamic linker characteristics... (cached) GNU/Linux ld.so checking how to hardcode library paths into programs... immediate checking for a Python interpreter with version >= 2.7... python checking for python... /var/local/miniconda2/bin/python checking for python version... 2.7 checking for python platform... linux2 checking for python script directory...${prefix}/lib/python2.7/site-packages checking for python extension module directory... ${exec_prefix}/lib/python2.7/site-packages checking for python2.7... (cached) /var/local/miniconda2/bin/python checking for a version of Python >= '2.1.0'... yes checking for a version of Python >= '2.7'... yes checking for the distutils Python package... yes checking for Python include path... -I/var/local/miniconda2/include/python2.7 checking for Python library path... -L/var/local/miniconda2/lib/python2.7 -lpython2.7 checking for Python site-packages path... /var/local/miniconda2/lib/python2.7/site-packages checking python extra libraries... -lpthread -ldl -lutil checking python extra linking flags... -Xlinker -export-dynamic checking consistency of all components of python development environment... yes checking for dlopen in -ldl... yes checking that generated files are newer than configure... done configure: creating ./config.status config.status: creating Makefile config.status: creating src/Makefile config.status: creating src/include/Makefile config.status: creating src/lib/Makefile config.status: creating src/bin/Makefile config.status: creating src/test/Makefile config.status: creating src/extensions/Makefile config.status: creating src/extensions/compact/Makefile config.status: creating src/extensions/compress/Makefile config.status: creating src/extensions/const/Makefile config.status: creating src/extensions/far/Makefile config.status: creating src/extensions/linear/Makefile config.status: creating src/extensions/lookahead/Makefile config.status: creating src/extensions/mpdt/Makefile config.status: creating src/extensions/ngram/Makefile config.status: creating src/extensions/pdt/Makefile config.status: creating src/extensions/python/Makefile config.status: creating src/extensions/special/Makefile config.status: creating src/script/Makefile config.status: creating config.h config.status: config.h is unchanged config.status: creating src/include/fst/config.h config.status: src/include/fst/config.h is unchanged config.status: executing depfiles commands config.status: executing libtool commands make gave some warnings eg: ./../../include/fst/string.h:207:59: warning: 'TokenType' is deprecated: Use fst::StringTokenType [-Wdeprecated-declarations] const SymbolTable *syms = nullptr) Then checks fails root@tiger21 openfst-1.6.3# make check Making check in src make[1]: Entering directory /tmp/openfst-1.6.3/src' Making check in include make[2]: Entering directory /tmp/openfst-1.6.3/src/include' make[2]: Nothing to be done for check'. make[2]: Leaving directory /tmp/openfst-1.6.3/src/include' Making check in lib make[2]: Entering directory /tmp/openfst-1.6.3/src/lib' make[2]: Nothing to be done for check'. make[2]: Leaving directory /tmp/openfst-1.6.3/src/lib' Making check in script make[2]: Entering directory /tmp/openfst-1.6.3/src/script' make[2]: Nothing to be done for check'. make[2]: Leaving directory /tmp/openfst-1.6.3/src/script' Making check in bin make[2]: Entering directory /tmp/openfst-1.6.3/src/bin' make[2]: Nothing to be done for check'. make[2]: Leaving directory /tmp/openfst-1.6.3/src/bin' Making check in test make[2]: Entering directory /tmp/openfst-1.6.3/src/test' make fst_test weight_test algo_test_log algo_test_tropical algo_test_minmax algo_test_lexicographic algo_test_power make[3]: Entering directory /tmp/openfst-1.6.3/src/test' make[3]: fst_test' is up to date. make[3]: weight_test' is up to date. make[3]: algo_test_log' is up to date. make[3]: algo_test_tropical' is up to date. make[3]: algo_test_minmax' is up to date. make[3]: algo_test_lexicographic' is up to date. make[3]: algo_test_power' is up to date. make[3]: Leaving directory /tmp/openfst-1.6.3/src/test' make check-TESTS make[3]: Entering directory /tmp/openfst-1.6.3/src/test' make[4]: Entering directory /tmp/openfst-1.6.3/src/test' FAIL: fst_test FAIL: weight_test FAIL: algo_test_log FAIL: algo_test_tropical FAIL: algo_test_minmax FAIL: algo_test_lexicographic FAIL: algo_test_power make[5]: Entering directory /tmp/openfst-1.6.3/src/test' make[5]: Nothing to be done for all'. make[5]: Leaving directory /tmp/openfst-1.6.3/src/test' ======================================================================== Testsuite summary for OpenFst 1.6.3 ======================================================================== # TOTAL: 7 # PASS: 0 # SKIP: 0 # XFAIL: 0 # FAIL: 7 # XPASS: 0 # ERROR: 0 ======================================================================== See src/test/test-suite.log Please report to help@www.openfst.org ======================================================================== make[4]: * [test-suite.log] Error 1 make[4]: Leaving directory /tmp/openfst-1.6.3/src/test' make[3]: * [check-TESTS] Error 2 make[3]: Leaving directory /tmp/openfst-1.6.3/src/test' make[2]: * [check-am] Error 2 make[2]: Leaving directory /tmp/openfst-1.6.3/src/test' make[1]: * [check-recursive] Error 1 make[1]: Leaving directory /tmp/openfst-1.6.3/src' make: * [check-recursive] Error 1 root@tiger21 openfst-1.6.3# ### KyleGorman - 2017-09-08 - 11:17 That warning is harmless. No clue about the errors in the tests. Can you run them manually and catch the first error they raise? (They're just executables.) ### KyleGorman - 2017-09-11 - 10:23 BTW, the last I checked, pyfst doesn't support any version of OpenFst released in the last few years anyways, so you should check on their documentation for the supported version. Log In ## Pre-initialized composition ### AhnK - 2017-08-31 - 22:07 Hi, The paper "Pre-initialized Composition for Large-Vocabulary Speech Recognition (Cyril Allauzen, et al, 2013)" mentioned the pre-initialized composition was implemented in the OpenFst toolkit. How can I use this feature? I can't find any explicit explanations on this. Log In ## trouble installing Pynini on Mac OS X ### AnnaKazantseva - 2017-08-21 - 17:34 Hello, I am having a bit of trouble installing Pynini on a Mac OS X (El Capitan). I installed OpenFST and re2 from sources and I am using Anaconda python 2.7. When I run “sudo python setup.py install” everything goes smoothly. But when I run “python setup.py test” I keep getting this error: <verbatim> sudo python setup.py test running test running egg_info writing pynini.egg-info/PKG-INFO writing top-level names to pynini.egg-info/top_level.txt writing dependency_links to pynini.egg-info/dependency_links.txt reading manifest file 'pynini.egg-info/SOURCES.txt' reading manifest template 'MANIFEST.in' writing manifest file 'pynini.egg-info/SOURCES.txt' running build_ext copying build/lib.macosx-10.6-x86_64-2.7/pywrapfst.so -> copying build/lib.macosx-10.6-x86_64-2.7/pynini.so -> Traceback (most recent call last): File "setup.py", line 88, in <module> test_suite="pynini_test") File "/Users/kazantsevaa/anaconda/envs/python2/lib/python2.7/distutils/core.py", line 151, in setup dist.run_commands() File "/Users/kazantsevaa/anaconda/envs/python2/lib/python2.7/distutils/dist.py", line 953, in run_commands self.run_command(cmd) File "/Users/kazantsevaa/anaconda/envs/python2/lib/python2.7/distutils/dist.py", line 972, in run_command cmd_obj.run() File "/Users/kazantsevaa/anaconda/envs/python2/lib/python2.7/site-packages/setuptools/command/test.py", line 172, in run self.run_tests() File "/Users/kazantsevaa/anaconda/envs/python2/lib/python2.7/site-packages/setuptools/command/test.py", line 193, in run_tests testRunner=self._resolve_as_ep(self.test_runner), File "/Users/kazantsevaa/anaconda/envs/python2/lib/python2.7/unittest/main.py", line 94, in init self.parseArgs(argv) File "/Users/kazantsevaa/anaconda/envs/python2/lib/python2.7/unittest/main.py", line 149, in parseArgs self.createTests() File "/Users/kazantsevaa/anaconda/envs/python2/lib/python2.7/unittest/main.py", line 158, in createTests self.module) File "/Users/kazantsevaa/anaconda/envs/python2/lib/python2.7/unittest/loader.py", line 130, in loadTestsFromNames suites = [self.loadTestsFromName(name, module) for name in names] File "/Users/kazantsevaa/anaconda/envs/python2/lib/python2.7/unittest/loader.py", line 91, in loadTestsFromName module = __import__('.'.join(parts_copy)) File "/Users/kazantsevaa/Software/pynini-1.6/pynini_test.py", line 16, in <module> from pynini import * ImportError: dlopen(/Users/kazantsevaa/Software/pynini-1.6/pynini.so, 2): Symbol not found: __ZN3re23RE213GlobalReplaceEPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKS0_RKNS_11StringPieceE Referenced from: /Users/kazantsevaa/Software/pynini-1.6/pynini.so Expected in: flat namespace in /Users/kazantsevaa/Software/pynini-1.6/pynini.so </verbatim> Any ideas how to fix this? Has anyone else run into this? I reinstalled openFST and re2 but it does not go away. Thanks for any tips! Anna Log In ## Error installing OpenFST under Ubuntu ### KeesKoenen - 2017-08-17 - 14:02 Want to install OpenFST 1.6.2. Ran ./configure, make, make check. I get this output. Please advice. Thanks. PS Kind of a newbie at this, pls be gentle <verbatim>fst_test.o: In function fst::internal::CompactFstImpl<fst::ArcTpl<fst::TropicalWeightTpl<float> >, fst::(anonymous namespace)::CustomCompactor<fst::ArcTpl<fst::TropicalWeightTpl<float> > >, unsigned short, fst::DefaultCompactStore<std::pair<int, fst::TropicalWeightTpl<float> >, unsigned short>, fst::DefaultCacheStore<fst::ArcTpl<fst::TropicalWeightTpl<float> > > >::Init(fst::Fst<fst::ArcTpl<fst::TropicalWeightTpl<float> > > const&, std::shared_ptr<fst::DefaultCompactStore<std::pair<int, fst::TropicalWeightTpl<float> >, unsigned short> >)': fst_test.cc:(.text+0x3c35f): undefined reference to fst::Int64ToStr(long, std::string*) make[3]: Entering directory '/home/kkoenen/stt/kaldi-master/tools/openfst-1.6.2/src/test' /bin/bash ../../libtool --tag=CXX --mode=link g++ -std=c++11 -o fst_test fst_test.o ../lib/libfst.la -lm -ldl libtool: link: g++ -std=c++11 -o .libs/fst_test fst_test.o ../lib/.libs/libfst.so -lm -ldl -Wl,-rpath -Wl,/home/kkoenen/stt/kaldi-master/tools/openfst-1.6.2/lib Makefile:645: recipe for target 'fst_test' failed make[3]: Leaving directory '/home/kkoenen/stt/kaldi-master/tools/openfst-1.6.2/src/test' Makefile:1054: recipe for target 'check-am' failed make[2]: Leaving directory '/home/kkoenen/stt/kaldi-master/tools/openfst-1.6.2/src/test' Makefile:358: recipe for target 'check-recursive' failed make[1]: Leaving directory '/home/kkoenen/stt/kaldi-master/tools/openfst-1.6.2/src' Makefile:414: recipe for target 'check-recursive' failed </verbatim> Same problem with openfst-1.6.3. ### KyleGorman - 2017-08-17 - 17:06 We can fix this on our side but you don't need to run "make check", just "sudo make install" should suffice. ### KyleGorman - 2017-08-18 - 15:03 Hi Kees, the error doesn't make a ton of sense to me. There is no function fst::Int64ToStr in OpenFst 1.6.3, as far as I can tell. You may want to try doing a clean install (i.e., go to where OpenFst headers and source are installed and remove them, then install again). You may want to try this outside of the Kaldi framework for installing OpenFst, since we don't maintain that, the Kaldi people do, and I don't know anything about it. Log In ## patch for failed cross-compile on openfst-1.6.3 ### BoonPangLim - 2017-08-03 - 14:02 Hi, I get an error linking when trying to compile using a cross-compiling toolchain: <pre> make[4]: Entering directory '/scratch/bplim/crossbuild2/tmp/openfst-1.6.3/src/extensions/far' /bin/sh ../../../libtool --tag=CXX --mode=link armv8rpi3-novumind-linux-gnueabihf-g++ -std=c++11 -o farcompilestrings farcompilestrings.o libfstfarscript.la ../../script/libfstscript.la ../../lib/libfst.la -lm -ldl libtool: link: armv8rpi3-novumind-linux-gnueabihf-g++ -std=c++11 -o .libs/farcompilestrings farcompilestrings.o ./.libs/libfstfarscript.so ../../script/.libs/libfstscript.so ../../lib/.libs/libfst.so -lm -ldl -Wl,-rpath -Wl,/scratch/bplim/crossbuild2/tmp/build/rpi3/openfst/lib /home/share/crosstool/rpi3/lib/gcc/armv8rpi3-novumind-linux-gnueabihf/6.3.0/../../../../armv8rpi3-novumind-linux-gnueabihf/bin/ld: warning: libfstfar.so.8, needed by ./.libs/libfstfarscript.so, not found (try using -rpath or -rpath-link) ./.libs/libfstfarscript.so: undefined reference to fst::IsSTList(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)' ./.libs/libfstfarscript.so: undefined reference to fst::IsSTTable(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)' </pre> I'm not sure if this is completely correct but this patch (followed by rerunning aclocal && automake && autoreconf) seems to fix it: <pre> openfst-1.6.3/src/extensions/far/Makefile.am 2017-06-27 13:12:19.000000000 -0700 +++ openfst-1.6.3/src/extensions/far/Makefile.new 2017-08-02 19:24:37.034355239 -0700 @@ -23,7 +23,7 @@ bin_PROGRAMS = farcompilestrings farcreate farequal farextract farinfo farisomorphic farprintstrings -LDADD = libfstfarscript.la ../../script/libfstscript.la +LDADD = libfstfarscript.la libfstfar.la ../../script/libfstscript.la ../../lib/libfst.la -lm$(DL_LIBS)

farcompilestrings_SOURCES = farcompilestrings.cc </pre>

### BoonPangLim - 2017-08-03 - 14:05

Update with cleaner versions:

ERROR: <verbatim> make[4]: Entering directory '/scratch/bplim/crossbuild2/tmp/openfst-1.6.3/src/extensions/far' /bin/sh ../../../libtool --tag=CXX --mode=link armv8rpi3-novumind-linux-gnueabihf-g++ -std=c++11 -o farcompilestrings farcompilestrings.o libfstfarscript.la ../../script/libfstscript.la ../../lib/libfst.la -lm -ldl libtool: link: armv8rpi3-novumind-linux-gnueabihf-g++ -std=c++11 -o .libs/farcompilestrings farcompilestrings.o ./.libs/libfstfarscript.so ../../script/.libs/libfstscript.so ../../lib/.libs/libfst.so -lm -ldl -Wl,-rpath -Wl,/scratch/bplim/crossbuild2/tmp/build/rpi3/openfst/lib /home/share/crosstool/rpi3/lib/gcc/armv8rpi3-novumind-linux-gnueabihf/6.3.0/../../../../armv8rpi3-novumind-linux-gnueabihf/bin/ld: warning: libfstfar.so.8, needed by ./.libs/libfstfarscript.so, not found (try using -rpath or -rpath-link) ./.libs/libfstfarscript.so: undefined reference to fst::IsSTList(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)' ./.libs/libfstfarscript.so: undefined reference to fst::IsSTTable(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)' collect2: error: ld returned 1 exit status

</verbatim>

PATCH: <verbatim>

openfst-1.6.3/src/extensions/far/Makefile.am 2017-06-27 13:12:19.000000000 -0700 +++ openfst-1.6.3/src/extensions/far/Makefile.new 2017-08-02 19:24:37.034355239 -0700

@@ -23,7 +23,7 @@ bin_PROGRAMS = farcompilestrings farcreate farequal farextract farinfo farisomorphic farprintstrings

### DanielPovey - 2016-06-29 - 19:40

BTW, this is on debian 7.10. Apparently OpenFst 1.5.3 says it should work from gcc version 4.7 which is what we are using.

### DanielPovey - 2016-06-29 - 20:25

I have found more information from valgrind and submitted a bug-report in bugzilla, I think it's a linux binutils bug.

### DanielPovey - 2016-06-29 - 21:40

I have documented in the last comment on https://sourceware.org/bugzilla/show_bug.cgi?id=20317 that the bug can be fixed by taking a certain line from a more recent binutils version 2.26 (this fix works for me; I have to compile binutils from source with the fix). So it looks like it's not a bug in OpenFst but a bug in binutils that it just happens to trigger. Our binutils is version 2.22.

Dan

## ShortestDistance very slow?

### PafnoutiT - 2016-06-21 - 11:42

I have an fst (actually a Kaldi lattice, it's an acyclic graph in the tropical semiring) on which I wanted to use fst::ShortestDistance to the final state, however it's very slow. It takes about 1.5 sec to process my lattice, whereas if I loop myself over the states and arcs to compute the same quantity it happens in 0.03 seconds. What could I have done wrong?

### KyleGorman - 2016-10-11 - 19:22

It's possible that automatic queue selection is failing because certain property bits are not yet set. You could try turning up logging to see what queue is selected (see http://www.openfst.org/twiki/bin/view/FST/FstAdvancedUsage#State_Queues), or if you're comfortable, could simply assert that the FST is (e.g.,) acyclic (and possibly topologically sorted) using SetProperties() before calling ShortestDistance. It is also possible that the explicit queue is less efficient than manual iteration for this simple case, in which case you should feel free to stick with your implementation.

## Shortest distance very slow?

### PafnoutiT - 2016-06-21 - 11:41

I have an fst (actually a Kaldi lattice, it's an acyclic graph in the tropical semiring) on which I wanted to use fst::ShortestDistance, however it's very slow. It takes about 1.5 sec to process my lattice, whereas if I loop myself over the states and arcs to compute the same quantity it happens in 0.03 seconds. What could I have done wrong?

## Python extension compilation error

### AlexBogatu - 2016-06-18 - 12:50

Hi, I am trying to use the python extension but I have some problems when I compile the source. I have done the following: - download openfst 1.5.3 - ./configure with --enable-far - make & make install - pip install openfst

This last command fails with the following error:

/usr/local/include/fst/script/fst-class.h:277:3: note: fst::script::FstClass::FstClass()

FstClass() : impl_(nullptr) {}

^

/usr/local/include/fst/script/fst-class.h:277:3: note: candidate expects 0 arguments, 1 provided

At global scope: cc1plus: warning: unrecognized command line option "-Wno-unneeded-internal-declaration" [enabled by default] error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

Am I missing something? Thank you!

### MarxerR - 2016-08-10 - 10:54

I get a similar error with clang:

building 'pywrapfst' extension

clang -fno-strict-aliasing -fno-common -dynamic -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c pywrapfst.cc -o build/temp.macosx-10.11-x86_64-2.7/pywrapfst.o -std=c++11 -Wno-unneeded-internal-declaration -Wno-unused-function

pywrapfst.cc:41102:22: error: no matching constructor for initialization of 'fst::script::FstClass'

^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

/usr/local/include/fst/script/fst-class.h:281:3: note: candidate constructor not viable: no known conversion from 'const fst::script::FstClass *' to 'const fst::script::FstClass' for 1st argument; dereference the argument with *

FstClass(const FstClass &other)

^

/usr/local/include/fst/script/fst-class.h:383:12: note: candidate constructor not viable: no known conversion from 'const fst::script::FstClass *' to 'fst::script::FstClassImplBase *' for 1st argument

explicit FstClass(FstClassImplBase *impl) : impl_(impl) {}

^

/usr/local/include/fst/script/fst-class.h:279:12: note: candidate template ignored: could not match 'Fst<type-parameter-0-0>' against 'const fst::script::FstClass *'

explicit FstClass(const Fst<Arc> &fst) : impl_(new FstClassImpl<Arc>(fst)) {}

^

/usr/local/include/fst/script/fst-class.h:276:3: note: candidate constructor not viable: requires 0 arguments, but 1 was provided

FstClass() : impl_(nullptr) {}

^

1 error generated.

error: command 'clang' failed with exit status 1

### KyleGorman - 2016-10-11 - 19:24

There's no reason to call pip. The version up there is not compatible with arbitrary OpenFst versions. If you do --enable-python then make and make install, you have already installed pywrapfst for whatever version of OpenFst you're building.

## Changing Stateid value.

### JoseOrtiz - 2016-06-18 - 12:30

I'm trying to read files in GFA format and then parse into an openfst format. In GFA format, each new line starts with a character that identifies what type of line it is. What I've done so far is if it starts with H, it ignore the rest of the line and go the next one. If it starts with S, it means it's a state, so it proceeds to identify each part of the line which consist of five strings including the S. After this, I add a state, but I want my state to hold the value of one of the strings identified in the line, which is a numerical value. As I understand, when you use an AddState, it return a Stateid which begins at zero, then increments by one. If it helps, I'm currently using a StdVectorFst object for the AddState function. I am reading about the SetState function from the vector_fst_base_impl class, would I be able to use this funcion to first get the StateId I want to change and then assign the new value? If so, would this cause any errors?

Thanks.

### DoganCan - 2016-06-19 - 16:40

Hi Jose

In OpenFst, state IDs are integers. They are used for uniquely identifying states. They are not tied to the states they identify. Most OpenFst operations do not preserve state IDs.

Cheers, Dogan

### PaulPoveda - 2016-05-31 - 10:31

Hello everyone,

I tried to install OpenFst to use it with SphinxTrain and it seem to work but I got stuck with the following error :

checking fst/fstlib.h usability... no checking fst/fstlib.h presence... no checking for fst/fstlib.h... no configure: error: fst/fstlib.h header not found

But the OpenFst binaries are well installed in /usr/local/lib/fst and the header too in /usr/local/include/fst..

Can anyone give me a clue about the issue ?

### KyleGorman - 2016-10-11 - 19:26

Assuming that /usr/local/include/fst/fstlib.h exists and is readable for your user, you may need to ensure that that directory is in your include path. At the compiler level you will want to use the -I flag to pass the directory, but I'm not sure how to do this with SphinxTrain. There may be something wrong with your environment (this directory should very much be in your include path) or with the SphinxTrain configuration setup, but this doesn't seem to be an issue with OpenFst per se.

## ArcIterator construction is very slow with delayed composition

### RyanPrenger - 2016-05-03 - 16:04

Hi. I've written a decoder loosely based on Kaldi's simple-decoder that is reasonably fast on an FST that was composed offline (lex + language model). When I try using the decoder on the FST created with delayed composition it's around 100 times slower. All the time seems to be spent in the ArcIterator construction steps. The language model is around 500 megs, and the lex is around 15 megs. Would one expect such a large slow down? How do people usually get decoders to perform with a large language model (the composition results in an FST about double the size)?

### KyleGorman - 2016-10-11 - 19:34

It could be quite slow to work with a delayed composition, particularly if the offline composition is slow. The former is used to implement the latter, and many tricks which speed up offline composition should make working with a delayed composition faster too. These include use of specific matchers (or modifying the inputs, e.g. with arc-sorting or setting the FST's properties) so that ComposeFst chooses the best matcher), state tables, and composition filters (see http://www.openfst.org/twiki/bin/view/FST/FstEfficiency), and selecting the best associativities for cascaded compositions. You might just try to make the offline composition fast and then port that solution to the delayed composition-based decoder.

### EstherJudd - 2016-04-12 - 04:28

I have created a large FST for g2p. Using this command I have been able to get some speed savings when loading. I apply this command in a shell script when building the FST. <br> <verbatim>fstconvert --fst_type=const --fst_align=true in.ofst out.ofst</verbatim> <br> I have also been able to reduce the size of my FST by creating a compressed version using: <br> <verbatim>fstcompress --gzip=true out.fst out_compr.fst</verbatim> <br> Now I read in my final FST in c++ but I use the StdVectorFst. I was reading that it is faster to use StdConstFst but when I use that, I get an error saying that that class is not derived from MutableFst. I use the following command to read in the compressed FST: <br> <verbatim> StdVectorFst *decompressed = new StdVectorFst(); Decompress(remap_fst_name,decompressed,true); </verbatim> <br> So my question is, how can I increase the load speed for this large FST?

### MichaelRiley - 2016-05-27 - 10:11

fstcompress is for small file footprint; expands on load and is not fast. constfst is a bit faster to load than vectorfst not memory-mapped and much faster memory-mapped (but then is demand paged as used). you can't mutate a constfst.

### EstherJudd - 2016-06-14 - 04:11

Thanks, I guess my question is if I convert a VectorFST to a constFST using the fstconvert command, what is the best way to load it faster in c++? I found this website that talks about this conversion and when I type fst-info it does say it is memory-mapped: http://www.lvcsr.com/openfst-133-mapped-files.html

And if I did want to make the resulting FST smaller without losing accuracy, what other things can I do besides fstcompress?

### KyleGorman - 2016-10-11 - 19:38

One generic method for reducing the size (in terms of arcs and states) is to use the Optimize method in optimize.h (a header shipped with Thrax and Pynini). It merges identically-labeled multi-arcs, removes epsilons, and applies determinization and minimization where they are known to terminate. See the documentation there for the specific details. It does about as well as one can hope to do automatically without knowing how the FST will be used.

## Possible bug in ReplaceFstMatcher?

### GautamT - 2016-02-25 - 22:59

ReplaceFstMatcher in replace.h instantiates MultiEpsMatcher for each of the component fst with mode kMultiEpsList and non terminal labels as possible epsilon. It does this for both MATCH_INPUT and MATCH_OUTPUT case. Under the condition ofMATCH_INPUT, this seems wrong to me. Let's look at the 2 possible scenarios under MATCH_INPUT.

1. epsilon on replace is false: Under this condition input labels continue to remain and do not become epsilons. Input symbols could represent a domain different from output symbols and therefore input labels with values matching those of the non terminals shouldn't be considered epsilons. Replace operation takes place based on the output label and not input label. Current behaviour would match self loop (kNoLabel) against these. Dropping this behaviour would also result in performance gain since one would not have to do a Find on the underlying machine for each of the non terminal label to match against self loop.

2. epsilon on replace is true: Under this condition, input label accompanying a non terminal output label is always re-written to epsilon irrespective of whether or not its value is also that of a non terminal. Matching self loop against input in this case is tricky because one would need to match against corresponding output label. To begin input labels are sorted to use this matcher, to match against output labels that are not sorted does not seem to be a good idea.

My thoughts are that for MATCH_INPUT if epsilon on replace is false then we don't need to use MultiEpsMatcher for each underlying machine, or not perform AddMutlipleEpsLabel. If epsilon on replace is true then not use this matcher and the user wouldn't be able to take advantage of the option to not always cache. The other possible option would be to mandate that if epsilon on replace is true then input label match that of output label for non terminal arc and check this on detecting a non terminal label in the output arc.

To confirm to myself that this is a bug, I constructed a root fst and a sub fst and created a replace fst that is used for a composition with another fst. The replace fst is the rhs fst or fst2 so that it is matched on the input side. The fsts are created in such a way as to expose this problem. The result was that if one were to not do a lazy replace, i.e. fully expand it and then use it for composition, one would get a different result from using a lazy replace or replace fst. For reproducibility, here are the fsts:

root fst 0 1 1 5 1 sub fst 0 1 5 7 1

fst1 or lhs fst for composition 0 1 1 1 1 expanded fst (fstreplace root 10 sub 5 expanded) using 5 as the non terminal to refer to sub fst and root label being a non existent label (non recursive) 0 1 1 0 1 2 5 7 2 3 0 0 3

As can be seen the composition fst1 o expanded fst would result in an empty fst. However when not expanding and doing the composition it would result in a non empty fst with a final state since ReplaceFstMatcher would treat the input label 5 in the replace fst from the sub fst as epsilon and therefore composition proceeds.

Am I missing something?

### GautamT - 2016-02-25 - 23:01

Somehow the fst formatting was off, fixing it.

root fst

0 1 1 5

1 sub fst 0 1 5 7 1

fst1 or lhs fst for composition

0 1 1 1

1

expanded fst (fstreplace root 10 sub 5 expanded) using 5 as the non terminal to refer to sub fst and root label being a non existent label (non recursive)

0 1 1 0

1 2 5 7

2 3 0 0

3

## G.fst in ngram format work fail on arm

### JerryLin - 2016-02-20 - 09:13

Hi all, I built G.fst in ngram format on x86 , and run on x86. it works fine I built G.fst in ngram format on arm , and run on arm, it throws bad_malloc exception. if you built G.fst in vector format on arm, and run on arm, it works fine. How to solve it ? Thank you.

### MichaelRiley - 2016-05-27 - 10:13

Possibly changed endianity when crossing platforms. Fix - don't do that.

### KyleGorman - 2016-10-11 - 19:40

Right, because IO is memory-mapped, don't expect serialized FSTs to work on a platform with a different endianity. It's a small price to pay for faster IO.

## How to compute edit distance of a given text line

### SopheaPRUM - 2016-02-18 - 20:50

Hello,

Given a text line containing mistakes (Example "AL WUYS LO W PRICES", the correct text line should be "ALWAYS LOW PRICES"). Considering that i have a lexicon and language model. How can i compute the edit distance of this given text line?

### KyleGorman - 2016-10-11 - 19:43

The last example here (http://www.openfst.org/twiki/bin/view/FST/FstExamples) involves making an edit transducer and then computing the edit distance.

The paper on Pynini (https://aclweb.org/anthology/W/W16/W16-2409.pdf) has an example of using a noisy channel model and a language model to resolve this kind of ambiguity, as well.

## TypeError: unsupported operand type(s) for -: 'int' and 'RestrictedProperty'

### MariamAAMER - 2016-02-04 - 13:01

Hi, I am trying to create a script. The part that doesn't work is the pasted below: class MyGrating(Structure): wg_W = PositiveNumberProperty(required = True) wg_L = PositiveNumberProperty(required = True) taper_L = PositiveNumberProperty(required = True) wg_W_Grating = PositiveNumberProperty(required = True) period_Grating = PositiveNumberProperty(required = True) DC_Grating = PositiveNumberProperty(required = True) n_periods_Grating = PositiveNumberProperty(required = True) width_Grating = PositiveNumberProperty(required = True) Out_wg = PositiveNumberProperty(default=1)# 1 si queremos que dibuje guia, 0 si no la queremos def define_elements(self, elems): inside the define_elements, i have a for loop, and inside it, i have todo make an arithmetic operation between two of the above defined properties. I get an error message that says: TypeError: unsupported operand type(s) for -: 'int' and 'RestrictedProperty' Somebody could help me to resolve it. Thanks to everybody

## Linking problem on 64bit arch

### DodoIvanecky - 2016-01-08 - 09:54

I am trying to create a simple application with openFST. While fst.Write is working for my simple graph, functions like RmEpsilon or Determinize are not possible to link. In the first case I am getting: undefined reference to fst::ReverseProperties(unsigned long, bool)' and it is O.K. since: uint64 ReverseProperties(uint64 inprops, bool has_superinitial); But why/from where is fst::ReverseProperties(unsigned long, bool) called, I have no idea. Any hint is appreciated.

### DodoIvanecky - 2016-01-08 - 11:22

Looks like I see the same problem with several Property function. Calling Determinize I got undefined reference to fst::DeterminizeProperties(unsigned long, bool, bool)'. Which is O.K. because: uint64 DeterminizeProperties(uint64 inprops, bool has_subsequential_label, bool distinct_psubsequential_labels)

### DodoIvanecky - 2016-01-08 - 12:33

Solved. Problem is compiler. Needs to be at least 4.8. The reported problem was with 4.6 version.

### AnnaWarry - 2016-08-27 - 19:52

If you need install the las version visit http://www.csee.ogi.edu/~sproatr/Courses/TextNorm/tutorial.html this web have a great manual.

Anna Warry Unix Administrator in https://www.good-fundraising-ideas.com/

## converting standard lattice format into and FST file

### MuhammadSalem - 2015-11-11 - 08:11

Hi i have Standard lattice format files, and i want to convert them into WFST to use them in Kaldi each file contains more than one NULL node, which were used in htk to decrease number of arcs, now if i want to convert those files to FST What should those null nodes be in the symbol file Not that i am working with the specific symbol file as input symbol file and output symbol file in order to finish the process

## Minimization for fsts with negative weights

--

I get weird results when I try to minimize a tropical weight fst with negative weights as shown below. Additionally, the minimization takes quite long (around 20 seconds) although the fst only has 2 states.

$cat bar.txt 0 1 1 1 -5 1 1 2 2 -1 1 0$ fstcompile bar.txt bar.ofst
$fstminimize bar.ofst bar.min.ofst$ fstprint bar.min.ofst
0      1      1      1      -16777220
1      1      2      2
1      16777216


As far as I understand, this fst is already minimal so fstminimize shouldn't really do anything.

If I convert all weights to positive weights, the problem vanishes

$cat baz.txt 0 1 1 1 5 1 1 2 2 1 1 0$ fstcompile baz.txt baz.ofst
$fstminimize baz.ofst baz.min.ofst$ fstprint baz.min.ofst
0      1      1      1      5
1      1      2      2      1
1


### DoganCan - 2015-09-02 - 22:29

Hi Miikka,

The minimization algorithm first applies weight pushing to normalize the distribution of weights along the paths of the input automaton. Weight pushing algorithm requires shortest distance from any state to final states to be defined. The example automaton with negative weights violates this condition due to the negative weight cycle.

You can look into Jason Eisner's WFSA minimization algorithm if you need to minimize automata with negative weights, particularly if they have negative weight cycles.

Cheers, Dogan

### MiikkaSilfverberg - 2015-09-03 - 06:59

Thanks for the explanation Dogan I'll look into Eisner's minimization algorithm. Do you happen to know about freely available implementations?

### DoganCan - 2015-09-03 - 22:34

Hi Miikka

I don't know if there is a publicly available implementation of Eisner's algorithm.

Cheers, Dogan

## Gentle Intro to Lazy FST Operations

### KennethRBeesley - 2015-08-13 - 16:18

Where can I find a gentle introduction to lazy FST operations? When and how to use them? Tradeoffs in size and performance?

### MichaelRiley - 2016-05-27 - 10:18

there is teh OpenFSt library paper ref'ed under background material topic here and there is some comments on efficiency in the efficiency topic.

## Wildcard arcs

### RussellMoore - 2015-08-10 - 09:58

Hi, sorry if this has been asked before. I am using OpenFST via the command line interface, so I want to build my FSTs as text files before compiling them. I would like to know if it's possible to add an arc that will:

• accept any as-yet unspecified symbol as input
• output the same symbol

So in my head I imagine something like:

0 1 a a [wgt_a]
0 1 * * [wgt_other]


Where the asterisks represent "any other symbol", i.e. that arc operates as a catch-all for any symbol that is not "a".

(Update 17:35 - I think what I want is a "rho" matcher. Can I specify this inside a text file? Or is there another way to do it?)

Thanks!

### DoganCan - 2015-08-11 - 22:59

Hi Russell,

In OpenFst, special symbol information is not stored in the FSTs themselves, but are handled by the FST operations that accept these symbols as arguments. Unfortunately, special symbol matching operations are not available through the command line interface of standard OpenFst binaries. What you can do instead is to use a custom FST composition binary that can handle rho symbols.

Check out https://github.com/dogancan/lexicon-unk. There you will find a tool called fstphirhocompose which allows bare bones matching of phi and rho symbols on the second FST. This tool won't do what you need out of the box since you want matched symbols to be copied to the output side. To fix that, simply change each occurrence of MATCHER_REWRITE_NEVER with MATCHER_REWRITE_ALWAYS in fstphirhocompose.cc before compiling the code. You use this tool exactly the same way you would use fstcompose, however you can also specify the labels corresponding to the phi and rho symbols in the second FST through the command line interface. Take a look at the run.sh script in that repo to see how this tool can be used to match special symbols.

Cheers, Dogan

## TopSort not working on FST

### VivekRangarajan - 2015-07-27 - 15:33

I run into a weird issue when I create a fst in C++ from a string by adding states and arcs. I have a function that does this work. When I write the FST using Write(), the final state is printed at the top. It looks like this:

5
0   1   66   66
0   2   81   81
0   3   2837   2837
1   2   10   10
1   3   749   749
1   4   930   930
2   3   66   66
2   4   324   324
2   5   864   864
3   4   8   8
3   5   146   146
4   5   30   30


fsttopsort on this fst does not sort the states either. I overcame this by just piping fstprint X | sort -k 1 -n | fstcompile

Any ideas as to why this might be happening and how it can be fixed inside the C++ code?

### DoganCan - 2015-08-03 - 20:08

Hi Vivek,

It is hard to debug your problem without seeing the code. My best guess is that you are not setting state 0 as the initial state.

Cheers, Dogan

### VivekRangarajan - 2015-08-05 - 16:38

Thanks Dogan. I resolved this sometime back and should've added a comment. Your guess was right, I was setting the SetStart to str.size() instead of 0.

## Generation of words accepted by a Finite-State Machine

### JulioCarrasquel - 2015-06-04 - 15:00

Hello,

I would like to know how to print all the words that can be accepted for a Finite-State Machine. Note that I'm using acceptors. There is a library called Lextools who does the Job, specifically using a function of that library called lexfsmstrings , but unfortunately it's not longer available free by the guys of AT&T.

### MichaelRiley - 2016-05-27 - 10:20

You could ask for a very large number of n-best from fstshortestpath.

### GaborPinter - 2016-08-01 - 06:46

> all the words... Provided the FSA/FST is acyclic (otherwise it would take somewhat longer), it is relatively easy to write a program in C++ that recursively iterates over all the arcs. With output size of around 10k per fst it was surprisingly fast.

### KyleGorman - 2016-10-11 - 19:45

Pynini (pynin.opengrm.org) i also has something called StringPaths which enumerates all paths in an acyclic FST. If you don't want to use Python, you could build something off of the paths.h header therein.

If the FST is not acyclic, it would, of course, take not just "somewhat" longer but infinitely longer

## Arcs labeled with strings

### JulioCarrasquel - 2015-06-03 - 09:51

Hi again,

I have another question. I need to print using Graphviz a FST whose arc labels could be displayed as letters. I will highly appreciate a solution for this case. Thanks

Julio

### JulioCarrasquel - 2015-06-03 - 11:22

I've already found a solution to this too. Thanks.

## Putting only one expression on arcs

### JulioCarrasquel - 2015-06-01 - 11:12

Hello,

I need to put only one expression in the arcs of my FST since I'm not<br> modelling transducers, but acceptors. <br>

I mean, for example, to put in an arc only "x" instead of "x:y". <br>

If someone has a quick solution to this, I will highly appreciate it.

Regards, <br> Julio Carrasquel <br> juliocarrasquel@gmail.com

### JulioCarrasquel - 2015-06-01 - 11:14

I forgot to mention, I'm using in my project the command line library.

### JulioCarrasquel - 2015-06-03 - 11:22

Nevermind,

I found a solution using the option --acceptor=true through the command fstdraw.

Thanks in any case.

## Expected edit distance

### RolandSchwarz - 2015-05-08 - 03:26

Hi,

I was wondering if it is possible using the library to compute the expected edit distance between two distributions over strings.

So, if X and Y are FSAs that contain a single sequence each, and T is my edit distance transducer then shortestdist(X o T o Y) over the tropical SR computes the edit distance.

(a) Now assume X and Y are probabilistic FSAs (log SR), i.e. they define a valid distribution over a set of sequences, how do I compute the expected edit distance, i.e. \sum_x \sum_y P(x) P(y) ED(x,y)?

(b) And finally, given a probabilistic FST that defines a joint distribution for X and Y, how do I compute the expected edit distance over the joint, i.e. \sum_x \sum_y P(x,y) ED(x,y)?

I guess in the case (a) the expectation SR might help? But how do I do it in the case (b)?

Any help would really be appreciated.

Thanks,

Roland

### DoganCan - 2015-05-11 - 19:58

Hi Roland,

There is an algorithm by Mohri for computing expected edit distance between weighted automata. See this paper for details.

I implemented a slightly modified version of this algorithm some time ago. I just put together a simple setup based on that implementation and pushed it to GitHub in case you want to take a look at it.

Cheers, Dogan

### RolandSchwarz - 2015-05-13 - 05:51

Amazing, thanks, I'll check it out!

Roland

## Assigning the symbol table to FSTs in C++

### AnuragDas - 2015-05-04 - 08:56

Hi, I am able to build a FST from a string through C++. But the symbol table is showing "None" when i do a fstinfo. I want to add symbol table like "**Byte" which gets added automatically when we do "thraxmakedep" from a grammar file with --save_symbols set = True but through c++ only.

## From weighted words to weighted phrase: how to isolate word scores at phrase level?

### FrancoisSofilvacer - 2015-04-11 - 13:43

Such an operation must be classical in any concrete usage of FST cascade (OCR, speech, ...) but I struggle to express it in OpenFST.

Take a simple example: the input is a character stream, made of words inter-spaced with blanks. A non-weighted and functional lexicon FST maps sequences of letters to words. To deal with uncertainty, a weighted and non-functional extension of it maps a sequence of letters to word/score pairs.

For handling a sequence of words, a concatenation of WFSTlex is needed, resulting in multiplying (in the weight semiring) word scores but eventually, the contribution of each word is lost: the only score available encompasses the entire input.

So a language model FST, further down in the cascade has no access to individual word scores, preventing it to compute a context-dependent weighted average for example, associating a word x with a different coefficient depending on it preceding y or z.

## Compiling with mingw32

### BenoitFavre - 2015-02-26 - 07:44

In case you want to compile openfst 1.4.1 with mingw (for cross-compiling it for windows from unix, for example). You need to use a mmap implementation such as mman-win32 and set LDFLAGS to use it.

make LDFLAGS=-lmman

Here is the patch:

diff -rupN old/src/lib/mapped-file.cc new/src/lib/mapped-file.cc

old/src/lib/mapped-file.cc 2014-04-30 00:15:18.000000000 +0200 +++ new/src/lib/mapped-file.cc 2015-02-26 08:53:46.339405863 +0100 @@ -20,6 +20,10 @@ #include <fcntl.h> #include <unistd.h>

+#if (defined _WIN32 || defined _WIN64 || defined WINDOWS || defined MINGW32) +#include <windows.h> +#endif + namespace fst {

// Alignment required for mapping structures (in bytes.) Regions of memory @@ -76,7 +80,13 @@ MappedFile* MappedFile::Map(istream* s, size_t pos = spos; int fd = open(source.c_str(), O_RDONLY); if (fd = -1) { +#if (defined _WIN32 || defined _WIN64 || defined WINDOWS || defined MINGW32) + SYSTEM_INFO system_info; + GetSystemInfo(&system_info); + int pagesize = system_info.dwPageSize; +#else int pagesize = sysconf(_SC_PAGESIZE); +#endif off_t offset = pos % pagesize; off_t upsize = size + offset; void *map = mmap(0, upsize, PROT_READ, MAP_SHARED, fd, pos - offset);

## Is there an existing implementation to check if one node is reachable from another in an fst ?

### KostyaK - 2015-02-14 - 17:49

Hi! I used with etalon code to do lookahead composition: <verbatim> typedef fst::StdVectorFst GRAPH; GRAPH _graph1, _graph2; //... fst::StdOLabelLookAheadFst graph1Look(_graph1); std::auto_ptr<GRAPH> graph2TrueRelabel(_graph2.Copy(true)); fst::LabelLookAheadRelabeler<GRAPH::Arc>::Relabel(graph2TrueRelabel.get(), graph1Look, true); fst::ArcSort(graph2TrueRelabel.get(), fst::StdILabelCompare()); std::auto_ptr<GRAPH> res(new GRAPH); fst::Compose(graph1Look, *graph2TrueRelabel.get(), res.get()); </verbatim> It's work great, but i want to do lookahead composition without PushWeightsComposeFilter for example. First, I try to repeat lookahead composition without class wrappers: <verbatim> typedef fst::StdVectorFst GRAPH; GRAPH _graph1, _graph2; //... typedef fst::LabelLookAheadMatcher<fst::SortedMatcher<GRAPH>, fst::olabel_lookahead_flags> LOOK_MATCHER; typedef fst::SortedMatcher<GRAPH> SORTED_MATCHER; typedef fst::SequenceComposeFilter<LOOK_MATCHER, SORTED_MATCHER> SEQ_FILTER; typedef fst::LookAheadComposeFilter<SEQ_FILTER, LOOK_MATCHER, SORTED_MATCHER, fst::MATCH_OUTPUT> LOOK_COMPOSE_FILTER; typedef fst::PushWeightsComposeFilter<LOOK_COMPOSE_FILTER, LOOK_MATCHER, SORTED_MATCHER, fst::MATCH_OUTPUT> PUSH_WEIGHTS_FILTER; typedef fst::PushLabelsComposeFilter<PUSH_WEIGHTS_FILTER, LOOK_MATCHER, SORTED_MATCHER, fst::MATCH_OUTPUT> PUSH_LABELS_FILTER;

typedef PUSH_LABELS_FILTER COMPOSE_FILTER;

// My compose options fst::CacheOptions cacheOptions; fst::ComposeFstImplOptions<LOOK_MATCHER, SORTED_MATCHER, COMPOSE_FILTER> composeOptions(cacheOptions); composeOptions.matcher1 = new LOOK_MATCHER(_graph1, fst::MATCH_OUTPUT);

//Relabel _graph1 such as fst::StdOLabelLookAheadFst graph1Look(_graph1); LOOK_MATCHER::MatcherData* matcherData = composeOptions.matcher1->GetData(); fst::LabelReachable<Arc> graph1Reachable(matcherData); std::auto_ptr<GRAPH> graph1Relabeled(&_graph1); graph1Reachable.Relabel(graph1Relabeled.get(), false); graph1Relabeled->SetInputSymbols(NULL);

//Relabel _graph2 such as fst::LabelLookAheadRelabeler<GRAPH::Arc>::Relabel(graph2TrueRelabel.get(), graph1Look, true); fst::LabelReachable<Arc> graph2Reachable(matcherData); std::auto_ptr<GRAPH> graph2Relabeled(&_graph2); graph2Reachable.Relabel(graph2Relabeled.get(), true);

fst::ArcSort(graph2Relabeled.get(), fst::StdILabelCompare());

composeOptions.matcher2 = new SORTED_MATCHER(*graph2Relabeled.get(), fst::MATCH_INPUT); composeOptions.filter = new COMPOSE_FILTER(*graph1Relabeled.get(), *graph2Relabeled.get(), composeOptions.matcher1, composeOptions.matcher2);

std::auto_ptr<GRAPH> composedRes(new GRAPH); *composedRes = fst::ComposeFst<Arc>(*graph1Relabeled.get(), *graph2Relabeled.get(), composeOptions); </verbatim>

But result of my lookahead composition code is not equal to result of etalon code: my graph is much smaller than etalon. What I'm doing wrong? May be exists other way?

### KostyaK - 2015-02-14 - 17:57

<verbatim> test </verbatim>

### KostyaK - 2015-02-14 - 18:08

Text formatting don't work?

## fstshortestpath --unique?

### GezaKiss - 2015-02-10 - 14:50

I used this command, expecting it to produce all unique output strings - but it produced only one. The FST I tried it on contained just the two possible pronunciations of the orthographic form "record", and nothing else (here '}' separates the input and output symbols): r}9r e}E c}k o}3r r}<epsilon> d}d r}9r e}I c}k o}oU r}9r d}d

My questions: 1) What is the meaning of the --unique parameter? 2) How can I get the behavior I want using OpenFst (i.e. all the possible output strings)?

Thanks! Géza

### KyleGorman - 2015-09-03 - 16:20

Hi Géza, I assume you've figured this out by now, but just in case: --unique simply indicates that the resulting shortest path FST will contain only distinct paths, but it doesn't specify how many distinct paths will be present. How many paths are returned is controlled by the --nshortest flag, which it defaults to 1.

(Presumably the reason --unique doesn't have the semantics you expected it to is that the machine might be cyclic, in which case termination would not be guaranteed. But any suggestions to improve the documentation would be taken under consideration.)

## farcreate bug

### DoganCan - 2015-01-15 - 21:36

Hi,

I just wanted to report a small bug in farcreate. When the --file_list_input option is set to true, farcreate skips the first input file on the command line due to an off by one error. Below is the relevant fix.

Cheers, Dogan

diff -ur openfst-1.4.1.orig/src/include/fst/extensions/far/create.h openfst-1.4.1/src/include/fst/extensions/far/create.h
--- openfst-1.4.1.orig/src/include/fst/extensions/far/create.h   2014-04-25 02:55:40.000000000 -0700
+++ openfst-1.4.1/src/include/fst/extensions/far/create.h   2014-04-25 02:57:48.000000000 -0700
@@ -48,7 +48,7 @@

vector<string> inputs;
if (file_list_input) {
-    for (int i = 1; i < in_fnames.size(); ++i) {
+    for (int i = 0; i < in_fnames.size(); ++i) {
ifstream istrm(in_fnames[i].c_str());
string str;
while (getline(istrm, str))


### KyleGorman - 2015-09-10 - 19:49

Hi, thanks for that bug report. A fix should be in the next release.

## Using LabelLookAheadMatcher on input fst and PhiMatcher on output fst in LookAheadComposeFilter

### OliverW - 2014-12-10 - 16:19

Hi,

I'm trying to use the LabelLookAheadMatcher on an input fst (a lexicon transducer) and the PhiMatcher on an output fst (a custom ngram fst) in a LookAheadComposeFilter.

The setup is as follows:

// Phimatcher typedef fst::PhiMatcher<fst::SortedMatcher<fst::Fst<fst::LogArc> > > PM;

// some definitions for the lookahead matcher class
public: typedef fst::LogArc ARC;
typedef fst::Fst<ARC> FST;
typedef fst::SortedMatcher<FST> SM;
typedef fst::AltSequenceComposeFilter<OLAM, PM> ASCF;
typedef fst::LookAheadComposeFilter<ASCF, OLAM, PM, fst::MATCH_OUTPUT> LACF;
typedef fst::LabelReachableData<typename ARC::Label> LRD;
typedef fst::LabelReachable<ARC, fst::DefaultAccumulator<ARC>, LRD> Reachable;
};

// the lookahed matcher

// create relabler to be used to relabel LanguageModelFST input labels

// create phi matcher with relabeled phi symbol
PM *PM22 = new PM(LanguageModelFST, fst::MATCH_INPUT, RD.Relabel(PHI_SYMBOLID), false);

// the composition options

// do the composition
fst::ComposeFst<fst::LogArc> Input_Unk_Lex_LM(Input_Unk_Lex_OSort, LanguageModelFST, copts2);


The input labels of my language model fst are relabeled according to the relabler. Unfortunately the composition result only contains one start state.

Interestingly if I leave out the relabling of the PHI symbol and the Language model input arc, I get an almost complete composition result the only part missing are the last states wich are reached by a transition with <eps> on the output arc.

Examples for the fsts can be found in here: https://www.dropbox.com/sh/dt7jayrw8upw9dg/AACOS99CeBvnGkc2xbowQnMTa?dl=0

in_lex_lm: the composition result in_lex: the input fst (lexicon) lm: the output fst (relabeled ngram) lm.orig: lm without relabeling Relabeling.txt: the used label mapping

I would be happy if someone can help me with my problem. Is it possible that the phi matcher dois not work with the lookahead? I could not get my head around it but thinking about it, it seems quite complicated. How does the lookahed for the phi label behave?

Oliver

## Special arcs from the command line?

### StevenBedrick - 2014-12-01 - 17:34

Hello-

Is there any way to use phi, rho, or sigma arcs with the OpenFST command-line tools? I know it's possible using the C++ API, but it'd be lovely to be able to use them as easily as we currently use epsilon arcs with fstcompile and friends.

-SB

## Error when compile openfst 1.4.1

### NguyenHieu - 2014-11-23 - 23:49

I have compiled openfst 1.4.1 on ubuntu 13.10_64bit, every thing done well but when I use command: "fstinfo --help", it annouce error: "fstinfo: error while loading shared libraries: libfstscript.so.0: cannot open shared object file: No such file or directory". Anyone can help me to solve this problem

### DoganCan - 2014-11-24 - 21:27

Hi Nguyen,

Did you do the "make install" step? That would explain why this library can not be found on your system even if you did not specify a custom installation prefix during configuration.

Alternatively, if you specified an installation prefix during configuration, you may need to add that path to LD_LIBRARY_PATH.

Hope these help.

Cheers, Dogan

## Minimizing Transducers

### AnA - 2014-11-21 - 12:21

Is there a way to minimize a finite state transducer, s.t. the minimized transducer has the same input/output behavior as the original transducer?

Example:

0 1 1 1
0 1 0 1
1 2 0 0
1 2 1 0
2 1 1 0
2 1 0 0

(a.fsm)

For this example I would expect to get the following minimized transducer:

0 1 0 1
0 1 1 1
1 1 1 0
1 1 0 0


If I run

  fstcompile < a.fsm > b.fsm
fstminimize b.fsm > c.fsm

I get an empty transducer instead.

### DoganCan - 2014-11-24 - 21:10

Hi,

The reason you are getting an empty transducer is because your input transducer does not have any final states.

Looking at the minimization result you are expecting, I'm guessing your input should look like the following.

0 1 1 1
0 1 0 1
1 2 0 0
1 2 1 0
1
2 1 1 0
2 1 0 0
2


Cheers, Dogan

## Determinize. Is this a bug ?

### OlivierB - 2014-10-06 - 06:50

Hi,

When I determinize this small (5 states, 7 arcs) Finite state recognizer, the resulting finite state machine is made of 48653930 states and 72980894 arcs and use 1.7 G of disk space.

Is this really the expected behaviour ? If so, can someone give me an explanation for this ? Thanks, Olivier

<verbatim> 0 1 1 1 10 0 3 1 1 10 1 2 2 2 10 1 2 1 1 1 3 1 1 1 3 4 2 2 4 3 1 1 </verbatim>

### OlivierB - 2014-10-06 - 06:59

New try, here is the input FST, with hopefully better formating :

0 1 1 1 10 <br>

0 3 1 1 10 <br>

1 2 2 2 10 <br>

1 <br>

2 1 1 1 <br>

3 1 1 1 <br>

3 4 2 2 <br>

4 3 1 1 <br>

### CyrilAllauzen - 2014-12-01 - 17:35

This is indeed the expected behaviour. In theory, the algorithm should not terminate but it converges due to floating point issues. The reason for that is that your machine does not have the twins property so there is no guaranty that the algorithm terminates. You can read more about this in the references on the determinize page. If you are really interested in the subject, you can also look at Daniel Kirsten's paper: your machine is polynomially ambiguous and does not have the clone property, so the determinization algorithm will not terminate.

### KennethRBeesley - 2015-08-28 - 18:15

It would be very convenient (for me at least) if the OpenFst library included a boolean HasTwinsProperty(A) function that would return true if and only if FST A had that property. Before trying to determinize any FST with cycles, OlivierB and I could simply call HasTwinsProperty(A) to make sure that the algorithm would terminate. (I.e., I'm not suggesting that OpenFst itself should routinely call HasTwinsProperty(A) and store the result in the properties of each FST. I know that it's a rather expensive test to perform. Rather, it would just be available for end-users like me to call explicitly when we need to know.)

## FST Permute ?

### SriramVenkatapathy - 2014-08-20 - 11:04

I remember using a python library sometime in 2009 (it was the AT&T version i think) which had a function to permute an FST given the window of permutation. I don't find it in OpenFst, neither can I find again the AT&T library that had this function. Does any have any pointers ?

## Problems using farcompilestrings and far_reader?

### EstherJudd - 2014-08-09 - 13:11

I am trying to create pronunciations for a long list of words to evaluate the g2p module. I figured the fastest way to do it is to create a FAR archive of all the words and then in c++ compose each FST in the FAR archive with the trained g2p model FST. But I am running into trouble with extracting each FST in c++. It says it finds a bad FST header <unspecified>.

Does anybody have similar code that I can derive from? Or tell me the correct flags to compile the FAR archive? Right now I use farcompilestrings --symbols=en.isyms --keep_symbols=1 test.txt test.far

The c++ code is borrowed from /usr/local/include/thrax/evaluator.h

### EstherJudd - 2014-08-17 - 02:01

I got a little bit further here because I found out I forgot to include /usr/local/include/libfstfarscripts.so as a linker file (I did hav libfstfar.so but that wasn't enough. Now I am stuck with the call to the function ReassignSymbols which is defined in thrax/evaluator.h. It says it has not been declared in the scope of main(). I added #include <thrax/evaluator.h> to the .cc file but that is not enough. What is the best way in c++ to include a function defined in a header file like that?

### PaulDixon - 2014-08-18 - 06:33

How about this? https://gist.github.com/edobashira/8183f3ef2e7431345241 It composes the model FST with every FST in the input FAR file and writes the output to another FAR file. For farcompilestrings I just used the symbols flags and the program assumes none of the FSTs have symbols tables and everything is in the tropical semiring.

I'm compiling on OSX 10.9 so you might need slightly different compiler switches for your platform

### EstherJudd - 2014-08-19 - 14:08

Thanks, I was able to compile the program on Ubuntu by slightly changing the Makefile. I have one question though, how do you do it with symbol tables? The input far has only letters and the model fst goes from letters to phonemes, so the output is phonemes.

### PaulDixon - 2014-08-19 - 15:22

If the symbol table in the far file and the letter symbol table of the g2p model are the same then compose won't throw an error and everything should just work.

### EstherJudd - 2014-08-19 - 15:45

The problem is that when you compile a far file, the symbol table only is added to the first FST and you have to make sure you transfer it to the rest. I have been trying to use code from thrax/evaluator.h to use a function called ReassignSymbols but haven't figured out quite how to pull that off yet.

### PaulDixon - 2014-08-20 - 02:05

OK, how about just compiling the far --keep_symbols=false and compile the model --keep_isymbols=false

### EstherJudd - 2014-08-20 - 03:02

I will give that a try.

### EstherJudd - 2014-08-20 - 07:13

That was it! Thanks so much for your help. Now I just have to figure out how to extract the output strings into a text file but that shouldn't be too hard.

## fstcompile to fstdraw

### AndreSchlichting - 2014-07-25 - 11:09

Is it possible to compile to draw a fst that has this format?

0   1   3005   0   4.71582,57,115
0   2   3783   0   0,57,115
0   3   0      1   -2.39746,71,347
...


I mean because of the weights: Tks.

### DoganCan - 2014-07-25 - 16:21

Hi Andre,

Maybe something like this?

Cheers, Dogan

## Double Weighted arcs with composition lookahead filter

### DominicHowell - 2014-07-25 - 06:00

Hi I am using the additional lattice4 library to provide two-weighted arcs. It works great but I am having a problem using the composition lookahead filter on WFSTs of the lattice4 arc type. I was just wondering if anyone had ever used composition filters on WFSTs with user-defined weights? Any help would be much appreciated.

### PaulDixon - 2014-07-29 - 16:09

Try this (taken from matcher-fst.h)
typedef MatcherFst<ConstFst<KaldiLatticeArc>,
FastLogAccumulator<KaldiLatticeArc> >,


And for shell tools building this into the shared object (from extensions/lookahead/olabel_lookahead-fst.cc) compile this into the lattice4 shared object code

 static FstRegisterer<KaldiLatticeOLabelLookAheadFst>


### AlexZatv - 2014-07-22 - 16:21

Hello! I am using Determinize on VectorFst, in multi-threaded program, in OpenFst port for Windows. But, unfortunately, sometimes it fails. May be someone here have experience in multithreading in OpenFst and can help?

## Deleting a specific arc in an FST

### VinodhR - 2014-06-25 - 17:55

Hi,

I am iterating all the arcs of an FST.


for(StateIterator<VectorFst <StdArc>>; siter(fstnew); !siter.Done(); siter.Next()) {
for(MutableArcIterator<VectorFst <StdArc> > aiter(&fstnew,siter.Value()); !aiter.Done(); aiter.Next()) {
StdArc arc = aiter.Value();

<<delete current arc>>
}
}



I see that DeleteArcs has been overloaded with an additional parameter n. What does this n refer to ?

How do I delete the current arc that is being iterated ? or for that matter any specific arc in an FST ?

V

### DoganCan - 2014-06-25 - 21:06

Hi Vinodh,

OpenFst does not provide a convenience method for deleting specific arcs from a VectorFst. I believe this has to do with the fact that these arcs are kept in a std::vector and the erase operation is inefficient for this data type, i.e. erasing an entry from a std::vector is linear in the size of the vector. As a workaround, you can construct a new fst, which includes only the arcs you want to keep, while iterating through the arcs of the original.

You can use DeleteArcs to delete all arcs at a state or the last n arcs added to a state.

Cheers, Dogan

## Using fstreplace after fst composition

### AndreSchlichting - 2014-06-24 - 16:07

Is it possible to use fstreplace tool in FSTs after composing them with some other FSTs?

For example, I have a grammar G with some symbols I want to replace later in it, let's say, S1 and S2. I have also a lexicon L. The symbols S1 and S2 have two other grammar FSTs GS1 and GS2.

I am trying:

fstcompose L G > LG

fstcompose L GS1 > LGS1

fstcompose L GS2 > LGS2

fstreplace LG 'label' LGS1 'label1' LGS2 'label2' > LG1

But this is getting a lot different if I do:
fstreplace G 'label' GS1 'label1' GS2 'label2' > G2

fstcompose L G2 > LG2

LG2 is correct, I can use it for my purposes, but I can't make it right the first way, any idea why?

### DoganCan - 2014-06-25 - 03:37

Hi Andre,

There were a few inconsistencies in your example commands. I edited your question and tried to fix them to the best of my understanding. Please check the question and confirm that the example commands match your case.

The most likely reason why LG1 and LG2 differ is because G includes arcs labeled with S1 and S2 while L does not. When you compose L with G, the paths in G which contain S1 and S2 do not match any path in L. I believe you can handle these symbols by adding two simple loops to L, e.g. assuming 0 is both the initial state and a final state of L, you would add

0 0 <eps> S1
0 0 <eps> S2

These loops will match S1 and S2 during composition, hence preserve the paths which include these symbols in LG.

Cheers, Dogan

### AndreSchlichting - 2014-06-25 - 08:10

Actually L contains the symbols S1 and S2 exactly like you sugested.

0   1   <eps>   <eps>   0.693147182
0   2   sil   <eps>   0.693147182
1   4   <eps>   S1
1   5   <eps>   S2
1   1   sil   _silence_
1   6   k   w6
1   12   d   w3
1   1   o^   w1   0.693147182
1   3   o^   w1   0.693147182
1   15   g   w5
1   18   t   w2
1   21   t   w4
1   1   #0   #0
1
2   1   #3   <eps>
3   2   sil   <eps>
4   1   #1   <eps>   0.693147182
4   3   #1   <eps>   0.693147182
5   1   #2   <eps>   0.693147182
5   3   #2   <eps>   0.693147182
6   7   a   <eps>
7   8   x   <eps>
8   9   o^   <eps>
9   10   R   <eps>
10   11   o   <eps>
11   1   S   <eps>   0.693147182
11   3   S   <eps>   0.693147182
12   13   o^   <eps>
13   14   i   <eps>
14   1   z   <eps>   0.693147182
14   3   z   <eps>   0.693147182
15   16   a'   <eps>
16   17   t   <eps>
17   1   S   <eps>   0.693147182
17   3   S   <eps>   0.693147182
18   19   e^   <eps>
19   20   N   <eps>
20   1   o_   <eps>   0.693147182
20   3   o_   <eps>   0.693147182
21   22   r   <eps>
22   23   e^   <eps>
23   1   S   <eps>   0.693147182
23   3   S   <eps>   0.693147182


The symbols S1 and S2 will still be in LG after composition because I want to replace them later, isn't it correct? But in fact GS1 and GS2 do not contain the symbols S1 and S2. They are instead what I want to use to replace later.

G

0   1   #0   <eps>   227.955917
0   4   w1   w1
1   2   S1   S1   1.60943794
1   3   S2   S2   1.60943794
1   4   w1   w1   1.60943794
1   5   w2   w2   1.60943794
1   1.60943794
2   1   #0   <eps>   227.955917
2
3   1   #0   <eps>   227.955917
3   2   S1   S1
4   1   #0   <eps>   227.955917
4   5   w2   w2
5   1   #0   <eps>   227.955917
5   3   S2   S2


GS1

0   1   #0   <eps>   16.534359
0   2   w6   w6   0.693147182
0   3   w5   w5   0.693147182
1   2   w6   w6   1.38629436
1   3   w5   w5   1.38629436
1   0.693147182
2   1   #0   <eps>   227.955917
2
3   1   #0   <eps>   227.955917
3


GS2

0   1   #0   <eps>   16.534359
0   2   w3   w3   0.693147182
0   3   w4   w4   0.693147182
1   2   w3   w3   1.38629436
1   3   w4   w4   1.38629436
1   0.693147182
2   1   #0   <eps>   227.955917
2
3   1   #0   <eps>   227.955917
3


By the way, I use the option epsilon_on_replace with fstreplace.

Thanks.

### DoganCan - 2014-06-25 - 20:36

Hi Andre,

The difference is due to the structure of the lexicon. There are two issues as far as I can tell.

1. The loops for handling S1 and S2 symbols are introducing extra disambiguation and silence symbols on the input side. As a result, LG1 will always include these extra symbols on its input side. On the other hand, G2 does not include S1 and S2 symbols. Hence these dummy loops are never traversed while composing L with G2. Make sure these dummy paths do not introduce extra symbols on the input side.
2. An optional silence is allowed at the beginning. These optional silences will make their way through to LGS1 and LGS2 and create additional silence symbols on the input side of LG1. You might want to use a slightly different lexicon, which does not allow these extra silences at the beginning, while constructing LGS1 and LGS2.

Cheers, Dogan

## Phi-matcher in PDT

### AntoineRaux - 2014-06-23 - 20:04

Is it possible to use specialized matchers in a PDT? Specifically, I want to create a PDT from an RTN (using pdtreplace) where some of the FSTs in the RTN have phi transitions. Is this possible with the current version of the pdt extension, and if so how?

### BenSnyder - 2014-08-14 - 12:43

I've been grappling with this questions too. I've created a PhiParenMatcher class (inheriting from PhiMatcher with the <M> template specialized to be a PDT ParenMatcher). The main problem is that the phi match fails when the destination state of the phi arc doesn't have an explicit match to the requested match_label, but does have a non-consuming match. This commonly happens in a PDT because the destination state has outgoing arcs labelled with parentheses, which are non-consuming and treated analogously to epsilons.

In fact, I think this is a more general problem with the PhiMatcher, even with a standard FST -- the phi match doesn't work properly when the destination state of the phi arc has an outgoing epsilon arc. I will update if I come up with a solution.

## Encoding and composing a character-to-token lexicon transducer with UNKs

### AaronDunlop - 2014-06-19 - 17:12

Is there a relatively standard or straightforward way to build a lexicon transducer (i.e., a character-to-token transducer) which handles unknown words, outputting an appropriate UNK symbol? The lexicon examples at http://www.openfst.org/twiki/bin/view/FST/FstExamples are pretty straightforward, and we can extend the lexicon script to construct a trie transducer for an arbitrary corpus (and handle spaces as word delimiters). But the transducer can't rewrite input that includes unobserved words.

My intuition is that generalizing further to handle UNKs would be similar to encoding a language model with backoff transitions (Allauzen et al., 2003). But I'm confused about the exact encoding, and how to compose the resulting transducer with other transducers in a chain, using the appropriate matchers (my guess is we want either a phi or a rho matcher, but I'm not certain which).

For example, consider a system consisting of 3 transducers: * F: A transducer which passes through all characters unchanged, or collapses sequences of repeated characters at some cost. (Note that this transducer is not deterministic, since it doesn't require that a repeated sequence be collapsed).<br> * G: Character-to-token lexicon transducer<br> * H: Word-level n-gram model<br>

In theory, H(G(F(x))) would normalize input text, deleting duplicated characters as appropriate (e.g., if the character-deletion costs and language model probabilities balance properly, it might change 'pair off access' to 'pair of aces' but leave 'turn off access' unchanged).

F will naturally include arcs with epsilon outputs to delete duplicate characters. But it seems like those epsilon outputs aren't semantically the same as the epsilon inputs in H that encode LM backoffs, or likely to analogous epsilon inputs in G. Both compositions (F o G) and (F o G) o H will require handling backoff arcs. That means we need to use a phi or rho matcher, correct? Can we perfom these compositions with the command-line fstcompose tool? Or does phi/rho matching require custom C++ code?

Of course, we could avoid the problem by using a character-level language model instead (in which case we'd only need 2 transducers, and could let OpenGRM encode the LM backoffs), but I'd like to incorporate the longer distance dependencies of a word-level model.

If there are standard tools to do this that I'm missing, I'd love pointers. Thanks in advance.

### DoganCan - 2014-06-19 - 22:54

Hi Aaron,

I'm not quite sure about the exact transduction you are looking for. Is the following an accurate description of what you have in mind?

1. Input is a sequence of characters where word boundaries are marked with whitespace characters, i.e. there is no confusion about where one word starts and ends.
2. Characters may repeat and deleting repeated characters by paying its cost is OK.
3. There may be character sequences which can not be mapped to any word in the vocabulary even after character deletion. These should be mapped to UNK.

Could you give an example input and the desired output? Ideally something which covers all above cases.

Cheers, Dogan

### AaronDunlop - 2014-06-20 - 19:09

Dogan,

I'm specifically interested in the character-to-token (lexicon) transducer that handles unknown words. The duplicate-deletion transducer is an example of a character-level transducer I might combine with it (the transducer I'm working with currently learns character-level mutations from data, and I want to support other character transducers as well, such as something hand-coded in Thrax).

To combine any of those character transducers with a word-level language model, I need a lexicon transducer in between.

So a trivial example chain (that only handles the word 'it') might be:

1. A character-mutation transducer including i:i and t:t pass-through arcs (at 0 cost), and t:<eps> at some cost

2. A lexicon which accepts i t <space> and outputs 'it'. (sorry, I can't figure out how to include images for these transducers). If any character other than 'i' appears at the start state, it should take a failure transition to an UNK state (at some cost); similarly for any character other than 't' in the 'i' state; the UNK state would have a self-loop for any character other than <space>, and an arc for <space> back to the start, outputting <unk>.

I can create transducer 2 with explicit arcs for all characters in the vocabulary, but of course that's very large for a sizable (word) vocabulary (millions of arcs). So it's impractical to compose with the language model (And it would presumably be much larger if we handled characters outside of the English ASCII range). I know language models use failure arcs to avoid encode backoff transitions compactly. My intuition is that a lexicon with UNK support would look similar. But I haven't figured out how to do that encoding, or what matchers to use (if a specific matcher is needed) for composition and inference. I probably need to go back and read Cyril, Mehryar, and Brian's 2003 language modeling paper in more detail. But it seemed like a problem that might already have been solved.

Did that help clarify my question at all?

Thanks, Aaron

### DoganCan - 2014-06-20 - 21:54

Hi Aaron

Does the following lexicon for the vocabulary {man, mars} match what you have in mind? Note that phi matches rest of symbols but does not consume any symbols, while rho matches rest of symbols and consumes a symbol.

Cheers, Dogan

### DoganCan - 2014-06-21 - 07:05

Hi Aaron,

I just pushed an example setup to github at the address

It includes a simple script for constructing the lexicon and a rudimentary fst composition tool with phi and rho matching on the input side of the second (right-hand) input fst.

Cheers, Dogan

### EvaHasler - 2015-08-11 - 11:27

Hi Dogan

I have a question that is related to your example from above. I am using your example to set up a rho matcher and it works fine. However, when the second lattice also contains arcs with epsilons as input symbols, it does not work: FATAL: ComposeFst: 2nd argument requires matching but cannot.

How do I fix it to be able to compose with epsilons as well?

Thanks, Eva

## Compile openfst 1.4.1 on 64-bit Ubuntu VM?

### EstherJudd - 2014-06-17 - 12:26

When I try to compile openfst on Ubuntu everything seems to compile just fine. But then I try to run a program such as fstinfo and I get the following error:

fstinfo: error while loading shared libraries: libfstscript.so.1: cannot open shared object file: No such file or directory

I have tried configuring with --enable-shared on, I have tried downgrading from gcc 4.8.2 to gcc-4.4. There seems to be no difference. The libraries are installed in /usr/local/lib/fst. I have even tried adding the directory to the path in my bash login-script. What else can I do?

### PaulDixon - 2014-06-17 - 16:07

Have you tried adding /usr/local/lib/ to the LD_LIBRARY_PATH environment variable?

### EstherJudd - 2014-06-17 - 17:22

That helped. Dumb of me to not think of that. Now I'm just grappling with compiling thrax. I get an internal compiler error whenever I try.

### PravinBhosale - 2015-10-21 - 02:47

how to add LD_LIBRARY_PATH environment variable

## Is there a simple way to add weighted arcs from final nodes to initial node?

### JihoonKim - 2014-06-09 - 00:50

I have an FST which has more than 10000 final nodes. I'd like to add weighted arcs from these final nodes to initial node. Closure operation seems not to support weighted arcs. Is there any simple way to do that?

### JihoonKim - 2014-06-09 - 04:03

To clear up, I want the arcs with epsilon input/output.

### PaulDixon - 2014-06-09 - 13:32

template<class Arc>
void AddFinalStartArc(MutableFst<Arc> *fst, typename Arc::Weight weight) {
typedef typename Arc::Weight Weight;
for (StateIterator<Arc> siter(*fst); !siter.Done(); siter.Next())
if (fst->Final(siter.Value()) != Weight::Zero())
}


## Compile error OpenFST 1.4.1

### RolandSchwarz - 2014-05-22 - 08:22

Hi, I'm trying to install the latest version on Windows using Cygwin. All previous versions that I tried (last was 1.3.3) compiled without problems. In this version I get the following compile error:

/bin/sh ../../libtool --tag=CXX --mode=compile g++ -DHAVE_CONFIG_H -I./../include -std=c++0x -MT determinize.lo -MD -MP -MF .deps/determinize.Tpo -c -o determinize.lo determinize.cc libtool: compile: g++ -DHAVE_CONFIG_H -I./../include -std=c++0x -MT determinize.lo -MD -MP -MF .deps/determinize.Tpo -c determinize.cc -DDLL_EXPORT -DPIC -o .libs/determinize.o as: .libs/determinize.o: too many sections (45066) /tmp/ccUSIrLz.s: Assembler messages: /tmp/ccUSIrLz.s: Fatal error: can't write .libs/determinize.o: File too big as: .libs/determinize.o: too many sections (45066) /tmp/ccUSIrLz.s: Fatal error: can't close .libs/determinize.o: File too big Makefile:390: recipe for target 'determinize.lo' failed

Any idea what has changed that it doesn't compile anymore? From what I understand there is little I can do on my side to make it work? Any help appreciated as usual.

Thanks,

Roland

### PaulDixon - 2014-05-24 - 14:59

I saw a similar error about too many segments when compiling version 1.4.1 on Visual Studio. In VS there is a switch /bigobj that worked for me but there doesn't seem to be an equivalent for Cygwin. You could try compiling without debugging information to try reduce the file size. I'm not familiar with Cygwin but would using a 64bit compiler help? http://mingw-w64.sourceforge.net/

You also try using the VS compiled version, I put a link to the solution in the contrib section. They should run fine under Cygwin

### DanielPovey - 2014-05-24 - 15:38

Someone reported on the Kaldi forum that if you configure with --enable-static --disable-shared , it works.

### DanielPovey - 2014-05-28 - 15:00

Kaisheng Yao posted this to kaldi-users:

I found the following link http://stackoverflow.com/questions/16596876/object-file-has-too-many-sections It is related to using template. Another way to resolve this is turning on optimization. In my case I use

EXTRA_CXXFLAGS += -Wno-sign-compare -O1 in src\lat\Makefile

I therefore changed the following in src\kaldi.mk and compilation was successful after this change

CXXFLAGS = -msse -msse2 -Wall -I.. -DKALDI_DOUBLEPRECISION=0 -fPIC -DHAVE_POSIX_MEMALIGN -DHAVE_CLAPACK -I ../../tools/CLAPACK/ -Wno-sign-compare -Winit-self -I ../../tools/CLAPACK/ -I $(FSTROOT)/include$(EXTRA_CXXFLAGS) -O1

### ChristopherBader - 2014-09-19 - 15:13

Just to clarify DP's comment:

--enable-static --disabled shared does NOT work

Turning on C++ optimization DOES work.

Put CXXFLAGS="-O" at the end of the configuration command, e.g.

./configure CXXFLAGS="-O"

## Multiple score on the arc...?

### ZeeshanAhmed - 2014-04-08 - 15:41

Hi, Is it possible to define multiple scores on the arcs? e.g. acoustic and language modeling scores at the same time. Would determinization and minimization algorithms work in this situation as is?

Thanks.

### PaulDixon - 2014-04-09 - 05:09

Yes to both. This is described in the OpenFst paper http://openfst.org/twiki/pub/FST/FstBackground/ciaa.pdf. Kaldi also has a LatticeWeight type which will do exactly what you want. Here is minimal standalone example that will build an OpenFst shared object and enable the type in the shell tools. https://gist.github.com/edobashira/10244848

### ZeeshanAhmed - 2014-05-10 - 12:37

Hi Paul, I compiled the standalone version on openfst.1.3.4. However, I am getting the following error when run the following test command. Can you help me what could be the problem?
$fstcompile --arc_type=lattice4 test.txt ERROR: GenericRegister::GetEntry : lookup failed in shared object: lattice4-arc.so FATAL: No operation found for "CompileFst" on arc type lattice4 ### DoganCan - 2014-05-11 - 19:06 Hi Zeeshan, Did you build the shared object lattice4-arc.so using Paul's gist? It seems like it is not in the working directory or in the dynamic library search path LD_LIBRARY_PATH (DYLD_LIBRARY_PATH on OS X). Cheers, Dogan ### ZeeshanAhmed - 2014-05-13 - 17:57 Hi Dogan, Thanks for the reply. It worked fine on pure Linux system but on cygwin it's not working though the lattice4-arc.dll is in the working directory. The problem is that on cygwin shared libraries for openFst are not built. Any idea on how to build openFst shared libraries on cygwin? Thanks, Zeeshan ### DoganCan - 2014-05-13 - 18:26 Hi Zeeshan, I have no idea how things work under cygwin. That being said, are you sure your problem is related to shared libraries not being built? I think your problem might have to do with the shared object file being named lattice4-arc.dll instead of lattice4-arc.so. Try renaming that file to lattice4-arc.so. If that does not work, try lattice4-arc.so.dll. Good luck. Cheers, Dogan Log In ## Error installing openGRM on Mac 10.9.2 ### SepidehA - 2014-04-05 - 13:38 Hi all, I have installed OpenFST successfully on my mac 10.9.2 with --enable-far. But then when I want to configure my OpenGRM for installing I get " checking fst/fst.h usability... no checking fst/fst.h presence... no checking for fst/fst.h... no configure: error: fst/fst.h header not found" but my fst.h is there in /usr/local/include any ideas? Thanks ### CyrilAllauzen - 2014-04-07 - 14:53 Hi, It seems this is due by Apple changing the default C++ standard library in the latest version of Xcode (version 5). CXX=-stdlib=libstdc++ ./configure should fix this. ### EstherJudd - 2014-05-07 - 13:03 I have the same problem. The command above does not fix it. I had it working on a different machine previously but just started working on my iMac with 10.9.2 and the latest version of Xcode installed. ### PaulDixon - 2014-05-08 - 01:15 The latest c++11 version of OpenFst (1.4.1) and OpenGrm Ngram (1.21) should just compile without any extra flags or environment variables. ### EstherJudd - 2014-05-08 - 11:43 Then why when I run ./configure do I get the following messages? fat binaries are installed in /usr/local/bin and it is in my path checking fst/fst.h usability... no checking fst/fst.h presence... no checking for fst/fst.h... no configure: error: fst/fst.h header not found ### AaronDunlop - 2014-05-08 - 16:10 Here's what worked for me on Mavericks (10.9.2) In the extracted opengrm-ngram-1.2.1 directory: # Replace imports of 'unordered_map' with 'tr1/unordered_map' perl -pi -e 's#<unordered_map>#<tr1/unordered_map>#' src/include/ngram/ngram-shrink.h src/include/ngram/ngram-marginalize.h src/include/ngram/ngram-count.h # Configure with Clang arguments ./configure --prefix=/usr/local CXX="clang++ -std=c++11 -stdlib=libstdc++" make sudo make install Note: To configure OpenFST, I used: ./configure --prefix=/usr/local --enable-compact-fsts --enable-const-fsts --enable-far=yes --enable-lookahead-fsts --enable-pdt CXX="clang++ -std=c++11 -stdlib=libstdc++" Of course, your mileage may vary. But maybe someone will find that helpful. ### CyrilAllauzen - 2014-05-12 - 10:48 ### CyrilAllauzen - 2014-05-12 - 10:55 Aaron: My original message was about compiling OpenGrm NGram 1.1.x. With version 1.2.1, you shouldn't need to -stdlib=libstdc++. Have you tried compiling without it? Esther: Could you clarify which version of OpenFst and OpenGrm NGram you were using? Which version of Xcode? Also could you look at config.log and see if you're seeing anything suspicious there. Also, if you've recently installed Xcode5, you might need to accept the license: /Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild -license accept. ### AaronDunlop - 2014-05-14 - 17:59 Cyril: Compiling OpenFST without -stdlib=libstdc++ seems to work, but configuring OpenGRM without that flag fails. ./configure --prefix=/usr/local CXX="clang++ -std=c++11" ... checking for fst/fst.h... no configure: error: fst/fst.h header not found ### AliceWalton - 2014-06-13 - 00:41 I have the same problem when I try to configure Thrax. It shows configure: error: fst/fst.h header not found Using the flag CXX="clang++ -std=c++11" does not fix it. In OpenFst, if I configure with CXX="clang++ -std=c++11 -stdlib=libstdc++" as suggested by Aaron, I got a fatal error of unordered_map not found. Well I can configure and install OpenFst without this flag but I cannot configure Thrax. Thank you very much for your help. ### JatishKhanna - 2015-09-26 - 05:33 Below would be helpful for the above thread- In short: 1. Install - openfst-1.5.0 or higher version http://www.openfst.org/twiki/bin/view/FST/WebHome 2. Set the LD_LIBRARY_PATH=/usr/local/bin [all the library *.so by default to be installed ] 3. Clean up if already tried [ make clean ] 4. Try installing it back [make && make install] walah Done Log In ## Intersection gives a FATAL error ### ErikAxelson - 2014-03-07 - 04:03 Hi all, I have two transducers with no symbol tables: 0 0 9 9 0 0 0 6 6 0 0 1 8 8 0 0 1 7 7 0 0 1 3 3 0 1 1 9 9 0 1 1 6 6 0 1 2 8 8 0 2 2 9 9 0 2 2 6 6 0 2 3 4 4 0 3 3 9 9 0 3 3 6 6 0 3 3 8 8 0 3 3 7 7 0 3 3 3 3 0 3 4 5 5 0 4 4 9 9 0 4 4 6 6 0 4 5 8 8 0 5 5 9 9 0 5 5 6 6 0 5 6 8 8 0 5 6 7 7 0 5 6 3 3 0 6 6 9 9 0 6 6 6 6 0 6 0  and 0 0 5 5 0 0 0 4 4 0 0 1 3 3 0 1 2 6 6 0 2 3 5 5 0 2 3 4 4 0 2 2 3 3 0 2 4 9 9 0 2 2 8 8 0 2 2 7 7 0 3 3 5 5 0 3 3 4 4 0 3 2 3 3 0 3 2 8 8 0 3 2 7 7 0 4 4 5 5 0 4 4 4 4 0 4 5 3 3 0 5 5 5 5 0 5 5 4 4 0 5 0  When I try to intersect them (t1 is the first transducer and t2 the second):  RmEpsilon(t1); RmEpsilon(t2); ArcSort(t1, OLabelCompare<StdArc>()); ArcSort(t2, ILabelCompare<StdArc>()); EncodeMapper<StdArc> encoder(kEncodeLabels,ENCODE); EncodeFst<StdArc> enc1(*t1, &encoder); EncodeFst<StdArc> enc2(*t2, &encoder); IntersectFst<StdArc> intersect(enc1, enc2);  I get an error message "FATAL: ComposeFst: 1st argument cannot match on output labels and 2nd argument cannot match on input labels (sort?)." What am I doing wrong? ### PaulDixon - 2014-03-08 - 09:10 You need to sort the encoded fsts before the intersection. ### ErikAxelson - 2014-03-12 - 04:50 It seems that using Encode instead of EncodeFst solves the problem, because then I'm able to sort the arcs before intersection: Encode<StdArc>(t1, &encoder); Encode<StdArc>(t2, &encoder); ArcSort(t1, OLabelCompare<StdArc>()); ArcSort(t2, ILabelCompare<StdArc>()); IntersectFst<StdArc> intersect(*t1, *t2); Log In ## OpenFst website back up ### CyrilAllauzen - 2014-02-14 - 18:20 The OpenFst website was down for a few days due to some hardware issue. The site is back up but some things (such as the doxygen links) are not working properly yet. casino offers We're working on resolving the remaining issues. Log In ## Linking OpenFst ### OliverAdams - 29 Jan 2014 - 21:09 I've built OpenFst in my home directory. I'm aiming to compile the following toy code: 1 #include <fst/fstlib.h> 2 3 int main(void) { 4 fst::StdVectorFst fst; 5 } So far I've run a number of commands that are variations of the following theme: g++ main.cc -I/home/oadams/openfst-1.3.4/src/include/ -L/home/oadams/openfst-1.3.4/src/lib/.libs/libfst.so I get various undefined reference errors. I feel I should probably have append the commands with -l , but for the obvious variations I get errors along the lines of: /usr/bin/ld: cannot find -lopenfst collect2: error: ld returned 1 exit status How can I get it to compile? Any help would be greatly appreciated. Log In ## Bug in weight-class.h: ConvertKeyToSoFilename() ? ### FedericoFlego - 26 Jan 2014 - 06:54 Hello, I've got a problem running the fstprune binary with my user defined arc type. It works fine if I do something like: > fstprune myfst.fst > /dev/null but if I do > fstprune --weight=8 myfst.fst > /dev/null I get ERROR: GenericRegister::GetEntry : tropical_LT_tropical.so: cannot open shared object file: No such file or directory Segmentation fault I believe this is due to function ConvertKeyToSoFilename() in weight-class.h: virtual string ConvertKeyToSoFilename(const string &key) const { return key + ".so"; } which should probably be: virtual string ConvertKeyToSoFilename(const string& key) const { string legal_type(key); ConvertToLegalCSymbol(&legal_type); return legal_type + "-arc.so"; } Thanks a lot for your feedback! Federico ### DoganCan - 27 Jan 2014 - 03:19 Hi Federico, I believe, you need a shared object for your user defined weight type (see attached code) as well. Make sure the the DSO is named tropical_LT_tropical.so and put it somewhere dlopen can find. Also, you will get the following error with that fstprune call since 8 can not be parsed as a pair weight. fstprune --weight=8 test.fst > /dev/null FATAL: StrToWeight: Bad weight = "8", source = WeightClass, line = 0  Try this instead fstprune --weight=8,0 test.fst > /dev/null  Cheers, Dogan #include <fst/arc.h> #include <fst/script/weight-class.h> using namespace fst; namespace fst { namespace script { REGISTER_FST_WEIGHT(LexicographicWeight<TropicalWeight, TropicalWeight>); } }  ### FedericoFlego - 27 Jan 2014 - 08:40 Thanks a lot Dogan! Actually I've got the shared object (named tropical_LT_tropical-arc.so) which is loaded correctly when I do for example fstprint. I see in the code DSOs are generally loaded adding the suffix -arc: legal_type + "-arc.so" It's only in src/include/fst/script/weight-class.h that the suffix is not added. After changing function ConvertKeyToSoFilename() in weight-class.h from:  virtual string ConvertKeyToSoFilename(const string &key) const { return key + ".so"; }  to:  virtual string ConvertKeyToSoFilename(const string &key) const { //return key + ".so"; //ff257: is this a bug? string legal_type(key); ConvertToLegalCSymbol(&legal_type); return legal_type + "-arc.so"; }  it worked fine. Thanks for confirming that I need to add REGISTER_FST_WEIGHT. I didn't do it initially because it doesn't seem to be mentioned in the on-line documentation, but after a grep of the src for REGISTER_FST I had found it. Then yes, the last missing bit was to pass 8,0 rather than 0. I've just tried and it works perfectly! Thanks a lot! Federico ### FedericoFlego - 27 Jan 2014 - 08:46 Apologies for the bad formatting... And I don't see an option to fix it ### DoganCan - 28 Jan 2014 - 03:55 Hi Federico, I am glad it worked out. Regarding the change you made in the source code, I don't think it is a bug. OpenFst binaries can dynamically load user defined weight types just like user defined arc types. Even though in practice these two mean more or less the same thing, technically these are two different things. In this particular case, fstprune is looking for the user defined weight type (not the arc type) to parse the command line weight parameter. That is why you don't see the same error when you call fstprint (or when you call fstprune without the --weight option) which only needs to load the user defined arc type. I believe the intended way of handling this issue is by creating another DSO named tropical_LT_tropical.so (in addition to tropical_LT_tropical-arc.so) which only registers the weight type. Patching the library code is not a great idea if you plan to update the OpenFst library later when a new version comes out or if you plan to run your code on different machines -- you will have to remember patching the library -- or if you plan to distribute your code. Cheers, Dogan Log In ## Is there a general approach to eliminating redundant (higher-weighted) paths in cyclic WFSAs? ### AlecS - 16 Jan 2014 - 05:36 Given a simple tropical semiring WFSA, defined by the expression A* | (A / 1)* | (B /2) Is there any general way to reduce this to the semantically equivalent WFSA A* | (B /2) In other words, how to remove redundant higher-weighted paths in tropical semiring WFSAs with weighted cycles? Does anyone have a solution to this? Thanks. ### DoganCan - 18 Jan 2014 - 21:12 Hi Alec, I don't think there is a general solution to your problem provided by the public release of OpenFst at this time. Determinization will do what you want as long as the input automaton has the twins property. The example you gave does not have this property, hence determinization will not work in this case. Having said that, conclusion section of this paper alludes to a general disambiguation algorithm for weighted automata which does not require the twins property to hold. I believe there is a good chance that algorithm will be a part of the public release of OpenFst once it is published, but this is pure speculation on my end. Cheers, Dogan ### AlecS - 04 Feb 2014 - 05:26 Hi Dogan, Thanks for your input. Yes I agree, if someone could devise a working disambiguation algorithm for weighted FSAs, that would do the trick. Also, a determinization algorithm that would determinize all determinizable WFSAs, despite the lack of twins property in certain cases (like the example above), would suffice in my case. Let’s hope your speculation is correct. Thanks, Alec ### PaulDixon - 07 Feb 2014 - 01:01 mifst toolkit has a determinization algorithm that tries to detect self loops with different weights. Log In ## Infinity weight in compiled fst ### HuanliangWang - 30 Dec 2013 - 23:45 I get Infinity weight in final state when I compile a text fsm by the command "fstcompile --arc_type=standard --fst_type=const infsm | fstprint > outfsm". The file "infsm" is very large, about 20G. I find a line "3 Infinity" in the file "outfsm". Is this an error? How can I solve it? Log In ## Why do I get Infinity weight in final state? ### HuanliangWang - 30 Dec 2013 - 23:36 I get a Infinity weight in final state when I compile a text fst into binary format by the command "fstcompile --arc_type=standard --fst_type=const infsm > outfst". The "infsm" file is large fsm. Then I convert the "outfst" into text format by command "fstprint". I find there is a line "3 Infinity" in the converted fsm file. Is the case a error? How can I solve it? ### PaulDixon - 31 Dec 2013 - 00:41 What does fstinfo report? Sounds like you have coaccessible states (states not reachable from a final). If the machine is connected the number of accessible and coaccessible states reported by fstinfo should be the same. You can remove the dead end states with fstconnect. ### HuanliangWang - 01 Jan 2014 - 21:45 Hi, Paul, I get the fst information by "fstinfo". It is as following: fst type const arc type standard input symbol table none output symbol table none # of states 425041220 # of arcs 1230392842 initial state 0 # of final states 0 # of input/output epsilons 425041217 # of input epsilons 425041217 # of output epsilons 425041217 # of accessible states 425041220 # of coaccessible states 0 # of connected states 0 # of connected components 1 # of strongly conn components 43672597 The number of accessible and coaccessible states is not same. ### HuanliangWang - 01 Jan 2014 - 21:51 sorry! It is like as: fst type const arc type standard input symbol table none output symbol table none # of states 425041220 # of arcs 1230392842 initial state 0 # of final states 0 # of input/output epsilons 425041217 # of input epsilons 425041217 # of output epsilons 425041217 # of accessible states 425041220 # of coaccessible states 0 # of connected states 0 # of connected components 1 # of strongly conn components 43672597 ### PaulDixon - 02 Jan 2014 - 01:50 Your machine doesn't have final states at at all and is probabaly useless. You could try generating a tiny FST with the same procedure and viewing it using fstdraw to check the structure is what you expect. ### HuanliangWang - 06 Jan 2014 - 06:50 thank you! I find there are two end states in input fsm "infsm". It is unexpected. I will check the error firstly. Log In ## error in ubuntu 13.10 ### YanWang - 19 Dec 2013 - 22:02 ../script/.libs/libfstscript.so: undefined reference to dlopen' ../script/.libs/libfstscript.so: undefined reference to dlerror' collect2: ld returned 1 exit status the openfst-1.3.4 compile error in ubuntu.How can I solve it? ### PaulDixon - 20 Dec 2013 - 04:44 Seems the -ldl linker switch is missing or need the order changing. This has been discussed in earlier posts. ### FalouuFalouu - 20 Jan 2014 - 06:10 I found a solution (Ubuntu 13.10): use gcc-4.4 and g++-4.4 instead of 4.8. 1. sudo apt-get install gcc-4.4 g++-4.4 2. Temporarily link gcc to gcc-4.4 (and g++ ...) or replace all occurences of gcc and g++ in Makefiles to 4.4 version. Maybe it is not all necessary, but I did it and it worked. ### DavidLandan - 22 Jan 2014 - 15:26 I'm seeing the error as well, and various orderings of the -lm & -ldl tags don't seem to change things. Unfortunately, while using gcc/g++ v4.4 is a workaround, you may run into issues later on with other tools that make use of the openfst library (as I discovered). My src/bin/Makefile.am only has one line with "LD" in it: LDADD = ../script/libfstscript.la ../lib/libfst.la -lm -ldl putting the -lm & -ldl switches in the middle or end doesn't matter: libtool: link: g++ -O2 -o .libs/fstarcsort fstarcsort.o -lm -ldl ../script/.libs/libfstscript.so ../lib/.libs/libfst.so ../script/.libs/libfstscript.so: undefined reference to dlopen' ../script/.libs/libfstscript.so: undefined reference to dlerror' collect2: error: ld returned 1 exit status libtool: link: g++ -O2 -o .libs/fstarcsort fstarcsort.o ../script/.libs/libfstscript.so ../lib/.libs/libfst.so -lm -ldl ../script/.libs/libfstscript.so: undefined reference to dlopen' ../script/.libs/libfstscript.so: undefined reference to dlerror' collect2: error: ld returned 1 exit status Has anyone else found a way to work through this using gcc/++ tools of v4.7 or later? ### YendaTrmal - 31 Jan 2014 - 19:35 Ok, I was able to figure it out, perhaps... Not really sure if that is the correct solution either. I don't really have experience with autotools and such, so I had to mess with everything and I'm not sure in which sequence the things have to done, but everything boils down to changing the Makefile.am files containing the LDADD to -LDADD = ../script/libfstscript.la ../lib/libfst.la -lm -ldl +LDADD = ../script/libfstscript.la ../lib/libfst.la +AM_LDFLAGS = -Wl,-no-as-needed -lm -ldl ### YendaTrmal - 31 Jan 2014 - 19:36 Sorry, wrong formatting -LDADD = ../script/libfstscript.la ../lib/libfst.la -lm -ldl +LDADD = ../script/libfstscript.la ../lib/libfst.la +AM_LDFLAGS = -Wl,-no-as-needed -lm -ldl ### StevenBedrick - 2014-02-16 - 16:12 First, I think there's a slight formatting bug above- from what I can tell, "no-as-needed" is a two-dash argument (i.e., it should be "-Wl,--no-as-needed" (note the two dashes before no-as-needed)). Second, editing the relevant Makefile.ac files and making this change does in fact result in a successful compilation on modern Ubuntu/GCC combinations, but the resulting binaries ("fstinfo", etc.) still aren't being linked properly against libfst and libfstscript. So, the result of running "make install" is a bunch of broken binaries. I've poked at it for a while and not had any luck fixing the problem; somebody with stronger automaker-fu than myself might have more luck. ### StevenBedrick - 2014-02-17 - 16:01 OK, never mind- I was correct that the issue was with my automake-fu. The problem was twofold: first, I wasn't actually telling automake to regenerate the Makefile.in files from the modified Makefile.am files, and second, once I figured out that that's why my changes to the Makefile.am files weren't having an effect, I next discovered that modern versions of Ubuntu come with a different version of automake/autoconf than the OpenFST distribution was created with. So, to get OpenFST built and installed correctly, here's what you'll need to do: First you have to modify the relevant Makefile.am files as described above by YendaTrmal (taking into account the typo I identified in my previous comment vis-a-vis double-dashed arguments). Second, you need to re-generate all of the automake-related files. I did this by running "autoreconf --force --install", which might be more than was needed- see my earlier comments about my lack of automake-fu. Before you can do this, however, you'll need to modify configure.ac in the top-level distribution directory; the AM_INIT_AUTOMAKE directive has a flag that treats all warnings as errors, and the differences in the automake/autoconf versions causes several warnings that will prevent autoreconf from running correctly. The "solution" was to get rid of the "-Werror" flag in the AM_INIT_AUTOMAKE directive. None of the errors seemed particularly dire, but take that observation with a grain of salt (see earlier comments about my lack of automake-fu). Removing "-Werror" caused all the relevant Makefile.in files to be properly regenerated, with the corrected LDADD/AM_LDFLAGS variables. Third, "./configure"/"make"/"make install" as normal. That seems to do the trick- the resulting binaries are all properly linked now. ### StevenBedrick - 2014-02-17 - 19:55 One more update- apologies to YendaTrmal, "no-as-needed" seems to work just fine as a single-dashed argument. I must have had some other typo going on when I was having trouble with it earlier. disregard that part of my comment. ### MathieuMorey - 2014-02-25 - 11:53 The solutions above did not really work for me, but they were a great starting point (thanks a lot, YendaTrmal and StevenBedrick !). I managed to get openFST to compile, install and work as a shared library on Ubuntu 13.10 using the following steps: 1. in "src/lib/Makefile.am", append: libfst_la_LIBADD = -ldl 2. in "configure.ac", replace: AM_INIT_AUTOMAKE([foreign nostdinc -Wall -Werror]) with: AM_INIT_AUTOMAKE([foreign nostdinc -Wall -Wno-extra-portability -Werror])" 3. autoreconf --force --install then the usual ./configure && make && make install . You can check it works by running "make check". Log In ## fstequal command ### LahiruSamarakoon - 19 Dec 2013 - 07:30 Hi all, What is the current status of the "fstequal" command? Is it working properly. My attempts to use it was unsuccessful. Why this command is not listed in the website? Thanks ### PaulDixon - 20 Dec 2013 - 04:34 It goes through two machines and does a state by state and and an arc comparison, setting --v to 1 should give detailed output. There seems to be typo in line 97 of the code ### FalouuFalouu - 20 Jan 2014 - 06:09 I found a solution (Ubuntu 13.10): use gcc-4.4 and g++-4.4 instead of 4.8. 1. sudo apt-get install gcc-4.4 g++-4.4 2. Temporarily link gcc to gcc-4.4 (and g++ ...) or replace all occurences of gcc and g++ in Makefiles to 4.4 version. Maybe it is not all necessary, but I did it and it worked. ### FalouuFalouu - 20 Jan 2014 - 06:12 Sorry, i accidentally put response in wrong place. Please delete. Log In ## Visual Studio OpenFST 1.3.4 ### RolandSchwarz - 02 Dec 2013 - 05:03 Does anyone have a working solution for compiling OpenFST 1.3.4 on Visual Studio? I found this project here but the latest version there is 1.3.1. Any help would be appreciated. Thanks, Roland Log In ## Compiling errors on Mac Mavericks 10.9 ### CesineC - 22 Nov 2013 - 11:23 Has anyone gotten OpenFST to compile on OS 10.9? Here are the errors I'm getting: $ ./configure --prefix=$OPENFST_HOME --enable-far --enable-pdt && make clean make && make install || { echo "Error building openfst."; exit 1; } .... In file included from fst.cc:21: In file included from ./../include/fst/fst.h:34: In file included from ./../include/fst/arc.h:28: In file included from ./../include/fst/expectation-weight.h:36: In file included from ./../include/fst/pair-weight.h:29: In file included from ./../include/fst/weight.h:82: ./../include/fst/util.h:24:10: fatal error: 'tr1/unordered_map' file not found #include <tr1/unordered_map> ^ 1 error generated. make[3]: *** [fst.lo] Error 1 make[2]: *** [all-recursive] Error 1 make[1]: *** [all-recursive] Error 1 make: *** [all] Error 2 Error building openfst.$ g++ --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn)
Target: x86_64-apple-darwin13.0.2



### KyleGorman - 22 Nov 2013 - 13:25

I have.

My understanding is that tr1 isn't part of any standard, and Apple moved it to a place where autoconf isn't looking for it. You need to set CPPFLAGS to point to that header and LIBS to point to that shared library. See:

### CesineC - 26 Nov 2013 - 14:15

Hi Kyle, I actually found your brew example which helped me get it to work, thats why i posted on your PR for brew also so that maybe they would accept that the current brew formula wasn't working, and that we might have to use libstdc++...

I was hoping that we weren't the only ones using mavericks, but it looks like we are.

For those who run into this later, here is some bash which (so far) appears to work well on mavericks:


./configure --prefix=\$OPENFST_HOME  --enable-compact-fsts --enable-const-fsts --enable-far=yes --enable-lookahead-fsts --enable-pdt CXX="clang++ -std=c++11 -stdlib=libstdc++" &&
make clean
make &&
make install || {
echo "Error building openfst.";
exit 1;
}


## Availability of linear classifier/tagger FST extension

### DavidHugginsDaines - 19 Nov 2013 - 13:25

I notice that there is a section on "Linear FSTs" in the FstExtensions page, however in the 1.3.4 release the referenced header files and fstlinear program aren't present. Is this available elsewhere, or likely to be released soon?

Thanks!

## How to compile on 64 bit machine

### ErrorNotFound - 22 Oct 2013 - 22:34

There seems to be only one OpenFST package for download. In other places I have seen separate packages for 32 bit and 64 bit. If I have a 64 bit machine, how do I install OpenFST there?

Thanks a lot.

## kPosInfinity in old version of fst

### ScottLawton - 11 Oct 2013 - 19:32

I'm trying to compile some code (http://bitbucket.org/yotaro/srtk) that apparently depends on an older (unstated) version of fst. It references kPosInfinity:

return LogLatticeWeight(fst::FloatLimits::kPosInfinity, fst::FloatLimits::kPosInfinity);

But, openfst-1.3.3/src/include/fst/float-weight.h just defines PosInfinity (no 'k' prefix).

I tried a superficial fix: deleting the 'k' in srtk -- but that yields "no matching function for call to ‘srtk::LogLatticeWeight::LogLatticeWeight(const float (&)(), const float (&)())’".

I'm a Python guy not a C++ guy ... any suggestions? I assume there's a way to cast it to (float, float), or a wrapper of some sort.

Thanks.

### PaulDixon - 12 Oct 2013 - 01:58

You need to add the template parameter and parentheses

fst::FloatLimits<float>::PosInfinity()


## Probably, bug: arc-map.h, ArcMap(), case MAP_REQUIRE_SUPERFINAL

### NickolayMerkin - 29 Aug 2013 - 05:01

I've done a static analysis of source code with PVS-Studio, and it has reported a probable bug.

V640 The code's operational logic does not correspond with its formatting. The statement is indented to the right, but it is always executed. It is possible that curly brackets are missing. arc-map.h 178

if (final_arc.ilabel = 0 || final_arc.olabel = 0 || final_arc.weight = Weight::Zero()) fst->AddArc(s, A(final_arc.ilabel, final_arc.olabel, final_arc.weight, superfinal)); fst->SetFinal(s, Weight::Zero());

fst-AddArc is conditional (then-branch of the preceded if) fst-SetFinal is unconditional, though it is indented as if it is conditional too.

## Optimization issue: functions Times and Divide on logartithmic and tropical weights

### NickolayMerkin - 29 Aug 2013 - 04:54

Hello, I've run my application (LV CSR decoder) that uses OTF WFST composition under the AMD profiler and found that much time is spent in two functions: inline TropicalWeightTpl Times and Divide. They do costy checks for NaN and +/-Inf before the arithmetic operation (+ for times, - for divide). But it's obvious that nan+anything is nan, inf+anything is inf, and inf-inf is nan. Thus, these pre-checks are redundant.

When I've eliminated these checks, the speed of my application has been increased for 5%.

### PaulDixon - 29 Aug 2013 - 22:32

Just out of interest are using the TropicalWeight type every where in your application? Or just as part of the composition and ArcIterator interface and then float and + operator everywhere else? Thanks.

### AlexZatv - 11 Sep 2013 - 15:15

It's almost everywhere in application. Looks like most decoders use log scores, but we with Nickolay use weights.

## Which optimization is applied to OpenFst 1.3.3 ?

### JihoonKim - 18 Aug 2013 - 21:28

Hi,

OpenFst 1.3.3 is much faster than OpenFst 1.3.1. While fstdeterminize of 1.3.1 takes about 1 hour, that of 1.3.3 takes only less than 1 minute! How can 1.3.3 be this faster? Could anyone explain the reason?

Thanks.

### AlexZatv - 19 Aug 2013 - 09:56

Hello! Are you talking about Windows version, or linux? There was some problems in windows standard library (std::hash_map is too slow,boost::hash_map is much faster)

### JihoonKim - 20 Aug 2013 - 04:16

Yes, I meant Windows version. It's because of windows standard library ! Thanks for the quick response.

## Can I get OpenFst 1.3.3 Visual Studio Project

### JihoonKim - 15 Aug 2013 - 04:47

Hi,

I have to build OpenFst in Windows. In the download page, Visual Studio project is just for 1.1 and 1.3.1. It seems that latest version 1.3.3 is much faster than 1.3.1 so I need 1.3.3 to be built in Windows.

Is there anyone who has 1.3.3 for windows ?

### PaulDixon - 15 Aug 2013 - 19:14

https://dl.dropbox.com/u/321851/openfst-1.3.3.zip

The memory map FSTs compile but aren't tested. The ngram FSTs aren't finished that why I never uploaded to the contrib section. Also includes VS2012 solutions.

### JihoonKim - 16 Aug 2013 - 01:15

Thanks for your quick response. It'd be great help for me.

## Merging symbol tables

### PrashantMathur - 26 Jul 2013 - 10:28

Hi,

I am a newbie to openfst, so this can sound stupid.

Is there a function which can merge two SymbolTables? I was using A->AddTable(SymbolTable* B) , where I want to merge B in A. function to merge two symbol tables. But I believe AddTable function just appends table B with A and starts indexing with the maximum key. ( I don't know if I am correct).

A 0 a 1 b 2 c 3

B 0 d 1 a 2 b 3

C - Resulting Table 0 a 1 b 2 c 3 d 4

My code needs to do this operation several times, so I thought I better ask first.

Thanks!

### PrashantMathur - 26 Jul 2013 - 10:30

sorry the formatting didn't turn out as I planned to .. let me try again  A 0 a 1 b 2 c 3 B 0 d 1 a 2 b 3 

 C 0 a 1 b 2 c 3 d 4 

### PrashantMathur - 26 Jul 2013 - 11:10

Is there a function which can merge two SymbolTables?
I was using  A->AddTable(SymbolTable* B) , where I want to merge B in A.
But I believe AddTable function just appends table B with A and starts indexing with the maximum key. ( I don't know if I am correct).
A
eps 0
a 1
b 2
c 3

B
eps 0
d 1
a 2
b 3

C - Resulting Table
eps 0
a 1
b 2
c 3
d 4

My code needs to do this operation several times, so I thought I better ask first.
Thanks!

### PaulDixon - 27 Jul 2013 - 05:15

If table A has holes, you can use CompactSymbols after merging the table to give a contiguous table. MergeSymbolTable will merge symbol tables and has optimizations for certain cases. Both are in symbol-table-ops.h

## I/O error for large FSTs

### SamS - 08 Jul 2013 - 14:49

I'm working on an application where I'd like to store a large FST in binary format, but I'm getting read/write errors. I'm convinced my large FST is well formed and that there might be a problem with big read/write operations. Here's a toy example that produces the same error:

   SymbolTable * syms = new SymbolTable;

StdVectorFst res;
for (int s = 0; s < big_number; s++) {
TropicalWeight final = -log(0.5);
res.SetFinal(s, final);
for (int dest = 0; dest < big_number; dest++) {
}
}

res.SetStart(0);
res.SetInputSymbols(syms);
res.SetOutputSymbols(syms);

return 0;


For me, the big number seems to be too big at around 923. The printed error message is:

ERROR: VectorFst::Read: read failed: badfile.fst


Any insight into what's going on and how to get around this issue would be appreciated. Thanks!

### SamS - 08 Jul 2013 - 16:39

And this turned out to be a problem with the local file system. Sorry!

### MikySu - 12 Jul 2013 - 12:54

I also had the same error message. i command "fstcompile --arc_type=log --fst_type=const GramFst > GramFst.out.bfsm",but the fst type of GramFst.out.bfsm didn't transform from vector to const. could someone give me some advice?

### MikySu - 12 Jul 2013 - 13:16

my file GramFst.out.bfsm and LexFst.out.bfsm both of fst type were vector. i commanded "fstcompose LexFst.out.bfsm GramFst.out.bfsm >a1.bfsm". i checked the information with fstinfo and then the error message came out "ERROR: VectorFst::Read: read failed:a1.bfsm". file GramFst.out.bfsm and LexFst.out.bfsm were fine.

### PaulDixon - 13 Jul 2013 - 04:23

What is the format of GramFst? Hoe many states and arcs do the bfsm machine have? Did this work for you fstconvert --fst_type=const < some.vector.fst > new.const.fst

## Making an FST with C++: How to define strings as labels?

### DennisRitter - 19 Jun 2013 - 10:30

In the shell you have the possibility to set up strings as ilabels and olabels (with isymbols.txt and osymbols.txt) But when I followed the QuickTour and wanted to do it all in the C++-Editor I couldn't find out, how to give the strings to an vectorfst. I can only give them integers as labels, because of the definition of stdarc.

Would be glad if someone could point out that to me.

PS Is it even effective to make it all in the C++-editor (I am using Eclipse)?

### PaulDixon - 20 Jun 2013 - 02:11

Use the SymbolTable to assign/get integer ids for the string labels

### DennisRitter - 20 Jun 2013 - 10:26

Is that the way you meant it?
SymbolTable* isyms;
isyms = new SymbolTable("isyms.txt");
SymbolTable* osyms;
osyms = new SymbolTable("osyms.txt");

fst.SetInputSymbols(isyms);
fst.SetOutputSymbols(osyms);

// Adds state 0 to the initially empty FST and make it the start state.
fst.SetStart(0); // arg is state ID

// Adds two arcs exiting state 0.
// Arc constructor args: ilabel, olabel, weight, dest state ID.
fst.AddArc(0, StdArc(isyms->Find("a"), osyms->Find("x"), 0.5, 1)); // 1st arg is src state ID

// Adds state 1 and its arc.

// Adds state 2 and set its final weight.
fst.SetFinal(2, 3.5); // 1st arg is state ID, 2nd arg weight

fst.Write("example.fst");

### PaulDixon - 20 Jun 2013 - 19:59

Yes, by convention symbol 0 is the for the epsilon label so call oyms->AddSymbol("\<eps\>"); first and make sure the return value is 0.

AddSymbol returns the id if the key already exists. So you can remove all the AddSymbol lines (except for the epsilon) and do this instead

From memory in this case I think you would need to add the Symbol tables after adding all the arcs. Internally I think it will make a copy when you use SetInputSymbols method so might also need to delete the SymbolTables also.

### DennisRitter - 21 Jun 2013 - 05:01

Thanks a lot Paul!

PS You now helped me the third time, first time as Edobashira on your old blog :). I was DreeDrunk at that time. Shall I mark my thread with something like a SOLVED-tag?

### PaulDixon - 23 Jun 2013 - 16:50

Glad I could help. Not sure if the thread should be marked as solved.

## GenericRegister::GetEntry : lookup failed in shared object

### MarkusD - 12 May 2013 - 20:01

I'm trying to register my expectation arc, so that it can be used in fstcompile, etc.

But I'm getting the runtime error:

ERROR: GenericRegister::GetEntry : lookup failed in shared object: expectation-arc.so FATAL: No operation found for "CompileFst" on arc type expectation

How can I fix this?

The register call in my code looks like this (my arc is called MDExpectationArc):

#include <fst/const-fst.h> #include <fst/edit-fst.h> #include <fst/vector-fst.h> #include <fst/script/register.h> #include <fst/script/fstscript.h> namespace fst { namespace script { REGISTER_FST(VectorFst, MDExpectationArc); REGISTER_FST(ConstFst, MDExpectationArc); REGISTER_FST(EditFst, MDExpectationArc); REGISTER_FST_CLASSES(MDExpectationArc); REGISTER_FST_OPERATIONS(MDExpectationArc); }}

Here is how I compile: /usr/bin/c++ -fPIC -O2 -shared -o expectation-arc.so expectation-arc-inline.cc.o /c01_data/mdreyer/software/openfst-1.3.3/lib/libfst.so

or this, gives the same error: /usr/bin/c++ -fPIC -O2 -shared -Wl,-soname,expectation-arc.so -o expectation-arc.so expectation-arc-inline.cc.o openfst-1.3.3/lib/libfst.so -ldl -Wl,-rpath,openfst-1.3.3/lib

It used to work for me with an old openfst version, but I'm getting this error now with openfst 1.3.3.

### MarkusD - 12 May 2013 - 20:07

For clarity, I'm getting the runtime error when I run this, for example:

fstcompile --arc-type=expectation

where it loads the expectation-arc.so file that I created using the c++ commands above.

### PaulDixon - 13 May 2013 - 08:10

Does the underlying weight type of the MDExpectationArc return the string "expectation" from the Type() method? I think the Type() method returns value need to match the name of shared object.

## how to use fstreplace tool

### HuanliangWang - 09 May 2013 - 21:54

Hi, I want to use fstreplace tool to generate a extended fst by replacing a nonternimal label in root fst by a subfst. But I got error when using the tool. I know it is necessary to preprocess two fsts, such as symbol table. But I don't know the detail. Could anybody tell me the detailed usage procedure?

Thanks,

Huanliang Wang

## Regex to FST

### RaihanRasool - 07 May 2013 - 12:56

Hi,

Using OpenFst library we can create FST for a given regular expression. But is there any way I could develop a generic RegEx to FST converter that takes 'any' regular expression and build FST ?

### PaulDixon - 09 May 2013 - 00:09

Maybe the answer to this question is what you are looking for

### MarkusD - 12 May 2013 - 19:55

Also, I think Thrax does something like that: http://www.openfst.org/twiki/bin/view/GRM/Thrax

### KennethRBeesley - 2017-01-26 - 22:58

I would also suggest taking a look at the Kleene language: www.kleene-lang.org

## Coordinate Descent Solution for Rational Kernels

### AbielRoche - 01 May 2013 - 13:59

Is there any available implementation of Coordinate Descent Solution for SVM using Rational Kernels? As it is mentioned in the paper: "Large-Scale Training of SVMs with Automata Kernels". Thank you for your help,

## “FATAL: StringWeight::Plus: unequal arguments” when running fstdeterminize.

### JiaP - 31 Mar 2013 - 21:51

The transducer in question simply describes a flat list dictionary without any explicit weight. It is very similar to the one given in the “Tokenization” example, except being much larger (about 664,000 states).

When I ran fstdeterminize on this transducer, the command line tool exited with following error message: FATAL: StringWeight::Plus: unequal arguments (non-functional FST?) w1 = 1 w2 = 2

I will try to construct such transducer directly using C++ API. Mean while, I’d appreciate any suggestion.

Jia

### PaulDixon - 01 Apr 2013 - 01:59

Are you adding disambiguation symbols like the ones described for the lexicon construction in this paper http://www.cs.nyu.edu/~mohri/pub/hbka.pdf

### EricLebigot - 03 Apr 2013 - 22:37

@PaulDixon: I had a look at the paper you link to. As far as I understand, for determinization to be possible, each input string should only produce a single output string, right?

## Composition result too large: how to make smaller?

### EricLebigot - 25 Mar 2013 - 22:27

Hi,

I am trying to calculate a kernel of the form T∘T^-1 from a transducer T, but the calculation uses up a lot of memory and I have to stop it before it converges (I stopped at 25 GB of swap). What could I try to do in order to calculate the composition result? (My ultimate goal is to calculate a kernel between acceptors A and B through the A∘T∘T^-1∘B composition and getting the (log) total weight between the starting and final states. A difficulty is that A∘T can produce a huge number of paths, and I was hoping that calculating T∘T^-1 could help solve this difficulty.)

I tried to first optimize transducer T, but this failed (a first epsilon removal yields a fatal error about StringWeight::Plus being given unequal arguments, as does determinization).

Here are some characteristics of transducer T, if they matter:

- About 15 nodes and 200k arcs.

- Accommodates input strings of any size through the use of internal loops.

If necessary, I can remove all loops, or some loops (because I know in advance how big the input strings can be); this would multiply the number of nodes and arcs by maybe 100 or 600 (depending on which loops I remove). Could removing loops significantly with the "too much memory used for composition" problem?

I could also limit the number of arcs, but this would make the transducer less expressive, so I would like to avoid this. However, if this is the best way of speeding up the calculation of T∘T^-1 and limiting its memory requirement, I am ready to do that.

What would you suggest, so that T∘T^-1 can really be calculated?

### PaulDixon - 26 Mar 2013 - 05:35

In the transducers do you have to traverse many epsilon transitions before reaching a label? This can delay the matching and lead to the generation of many dead end states. These states are normally removed after composition.

I don't think epsilon removal uses the StringWeight are your weights in the log/tropical semiring? Another problem could be negative weighted epsilon cycles.

### EricLebigot - 26 Mar 2013 - 10:02

(I'm not fully familiar with FST, so please let me know if I'm not clear.)

Transducer T can produce many epsilons (in about half of the 200k arcs) before reaching a final state (and T^-1 therefore can consume many epsilons). This produces a huge number of possible outputs, and I was hoping that OpenFST could somehow "magically" calculate T∘T^-1 despite this. This epsilon production cannot be removed, because it is one of the main purposes of transducer T to remove parts of the input strings.

The weights are in the log semiring. There are cycles that produce epsilon (in T), with a weight of LogWeight(0.69…) (corresponding to 0.5 in real space).

With these precisions, what do you think is the best way to speed up the composition? reducing the number of arcs (I can do this by reducing the number of recognized "letters")?

### PaulDixon - 01 Apr 2013 - 07:10

Do have an example transducer you can share online?

### EricLebigot - 01 Apr 2013 - 23:56

@PaulDixon: I put a simplified version at http://www.normalesup.org/~lebigot/special/transducer.pdf. The real version has many more arcs (20k arcs where there are ~40 between two states in the simplified version, 1k arcs where there are ~4 arcs). Note: I will have a look at the disambiguation paper that you mentioned yesterday: maybe determinization will speed up the calculations?

### DoganCan - 02 Apr 2013 - 06:09

Hi Eric,

The problem is that transducer T can produce infinitely many consecutive epsilons (when it is not restricted by input A) and similarly transducer T^-1 can consume infinitely many consecutive epsilons (when it is not restricted by output B). Naturally, composition does not terminate. You will have to avoid "all" epsilon loops on the output side of T if you want to compute ToT^-1, i.e. limiting the number of output-epsilon arcs won't help. Unfortunately there is no such magic hidden in openfst

Cheers, Dogan

### EricLebigot - 02 Apr 2013 - 09:32

@DoganCan: Thank you for your input: very interesting! I can certainly remove infinite epsilon-generating loops (thanks to the structure of the input strings). However, a strange thing is that T∘T^-1 does terminate, with OpenFST (when I restrict the number of arcs, like in the simplified version—the change is only quantitative, not qualitative). Does this mean that the composition obtained by OpenFST is likely incorrect? I am a little confused.

### EricLebigot - 02 Apr 2013 - 09:46

PS: I put the transducer from the PDF above in files transducer.fst, transducer.ssym (state symbols) and transducer.sym (labels) at http://www.normalesup.org/~lebigot/special/, if this can help. OpenFST happily calculates T∘T^-1 after arc-sorting the outputs of T.

### DoganCan - 03 Apr 2013 - 04:19

Hi Eric,

It looks like my initial assessment was incorrect. I thought composition would somehow force enumeration of all possible matches between the epsilon cycles on the output side of T and the epsilon cycles on the input of T^-1. Instead, what happens is composition simply creates paths which first generate possibly infinitely many epsilons and then consume possibly infinitely many epsilons. The composition output for the small transducer is correct as far as I can tell.

I think the blow-up in composition is caused by the self-loops on states A_CAT_V_KEPT and B_CAT_V_KEPT. More explicitly, since T maps each input symbol on these arcs to "Wild" and T^-1 maps "Wild" back to these symbols, composition output has a mapping between each input symbol pair. If the big transducer is following the same pattern, there will be around 10K x 10K = 100M self loops in the output of composition for each of these two states.

Cheers, Dogan

### EricLebigot - 03 Apr 2013 - 05:31

Hi Dogan,

Thank you for following up! Your remarks are very useful. The structure of the input data is such that the Wild labels can be specialized, so that the number of possible matches is reduced (PCP_123:Wild is now PCP_123:PCP_Wild, etc., so that only similar categories match through Wild). The result is that calculating kernel elements is much faster, thank you!

Maybe will the calculation of T∘T^-1 require less memory, too…

Best, EOL

## Double delete bug in fst far reader

### DoganCan - 05 Mar 2013 - 04:14

Hi,

I came across a double delete bug in the implementation of FstFarReader class. ReadFst method deletes the current fst before reading a new one, however if there is no more fst to read then the current fst pointer is left dangling. That leads to a segfault when the class destructor is called. You might trigger the error with

farequal fst1 fst2


I fixed the problem by moving the pointer deletion after the check for whether there is more fst to read.

This error is not triggered by other far utilities reading from an fst archive (farprintstrings, farinfo and farextract) since these tools, unlike farequal, do not delete the far reader pointer after they are done with it. It might be a good idea to fix that as well.

Cheers, Dogan

## Get the Arcs whose nextstate is equal to a given q

### JuanMiguelCejuela - 04 Mar 2013 - 08:31

Is it possible to efficiently get all the arcs which have as nextstate a given state q?

### DoganCan - 05 Mar 2013 - 04:50

I don't think there is builtin functionality for efficient retrieval of incoming arcs. Your best option is to implement a new Fst class similar to VectorFst which has this functionality. You will first need to modify the definition of VectorState and add a new vector of arcs for incoming arcs. Then you will need to modify the methods that has to do with arcs such as AddArc, RemoveArcs, DeleteArcs, etc. to handle the incoming arcs as well as outgoing arcs. Finally you will need to define a new arc iterator for visiting incoming arcs.

Cheers, Dogan

### AlexZatv - 05 Mar 2013 - 07:10

May be, you can revert the Fst? If you will revert the fst but keep the state id's the same, you can simply get revert(F).out_arcs(q).

## Testing an FST for epsilon cycles

### KennethRBeesley - 24 Jan 2013 - 00:41

If I have an FST, how can I best test to see if it has input epsilon cycles?

### KennethRBeesley - 22 Feb 2013 - 12:16

Here's my own best effort, which seems to be working. Comments welcome. C++ is definitely not my native language.

bool isInputEpsilonCycleFree(StdVectorFst * fstp) {

set<StateId> stateSet ;   // reuse this set of "visited states"
// if the machine has no cycles at all, then it's epsilon-cycle-free
if (fstp->Properties(kAcyclic, true))
return true ;

// loop through the states, looking for input epsilon cycles
for (StateIterator<StdVectorFst> siter(*fstp) ;
!siter.Done() ;
siter.Next()) {
stateSet.clear() ;
if (findInputEpsilonCycle(fstp, siter.Value(), stateSet))
return false ;
}
// no input epsilon cycles found
return true ;
}

bool findInputEpsilonCycle(StdVectorFst *fstp, StateId s, set<StateId> &stateSet) {
// try to find an input epsilon cycle from state s back
// to itself, or any other input epsilon cycle found during the search

// stateSet is the set of states previously visited during the search from s
if (stateSet.find(s) != stateSet.end())
// found a loop
return true ;
// remember state s as "visited"
stateSet.insert(s) ;
for (ArcIterator<StdVectorFst> aiter(*fstp, s) ;
!aiter.Done() ;
aiter.Next()) {
StdArc arc = aiter.Value() ;
// deal only with arcs with epsilon on the input side
if (arc.ilabel != 0)     // 0 (zero) is wired in as epsilon in OpenFst
continue ;
// here the ilabel is 0 (epsilon)
// recursively look for an input epsilon loop from arc.nextstate
if (findInputEpsilonCycle(fstp, arc.nextstate, stateSet))
return true ;
}
return false ;
}


### PaulDixon - 23 Feb 2013 - 05:48

Using the OpenFst visitor to find a topological order on the epsilon input subgraph should work.

template<class Arc>
bool HasInputEpsilonCycle(const Fst<Arc>& fst) {
vector<typename Arc::StateId> order;
bool acyclic = false;
TopOrderVisitor<Arc> top_order_visitor(&order, &acyclic);
DfsVisit(fst, &top_order_visitor, InputEpsilonArcFilter<Arc>());
return !acyclic;
}


### KennethRBeesley - 21 Mar 2013 - 15:30

Paul, Many thanks. Very elegant and seems to work perfectly.

## Example of k-closed semiring that is non-idempotent

### KonstantinSelivanov - 10 Jan 2013 - 04:30

Hi,

could you give an example k-close semiring that is not idempotent.

Thanks!

### CyrilAllauzen - 18 Jan 2013 - 20:46

Hi, The k -tropical semiring defined by (Mohri, 2002) is (k-1) -closed and not idempotent when k ≥ 2 .