OpenFst Forum 2018 Archive

Unknown FST type "vector" (arc type = "standard")

BruceLin - 2018-12-25 - 01:43

When I run fstarcsort --sort_type=olabel my_in.fst > my_out.fst, I get the following error:

MutableFst::Read: Unknown FST type "vector" (arc type = "standard")

My environment is Cygwin and openfst-1.6.7. I can't figure out what I got wrong. Can someone help?

pair weights in weight

AliiiiRezaaa - 2018-11-04 - 08:08

Hi all, I want to compile an FST that has multiple weights per arc and draw it. How can I do that? For example, my fst.txt is:

0 1 311 0 0,20,51
0 2 361 0 0,51,68
0 247 0 1 -1.09863,51,150
0 3 864 0 0,71,91
0 4 885 0 0,94,118
0 5 545 0 0,118,150

but I get an error:

fstcompile fst.txt
FATAL: FstCompiler: Bad weight = "0,20,51", source = index.1.txt, line = 1
ERROR: FstHeader::Read: Bad FST header: standard input

best regards

AliiiiRezaaa - 2018-11-04 - 08:11

0 1 311 0 0,20,51
0 2 361 0 0,51,68
0 247 0 1 -1.09863,51,150
0 3 864 0 0,71,91
0 4 885 0 0,94,118
0 5 545 0 0,118,150
0 6 818 0 0,150,209
0 7 0 0 -2.2998,57,239

KyleGorman - 2018-12-29 - 19:33

This looks like you want to use triple weights, not just pairs.

The whole process is documented here: http://www.openfst.org/twiki/bin/view/FST/FstAdvancedUsage#NewArcs

You need to be ready to write C++ and to follow the examples of how other weight/arc types are created and registered, of course.

ComposeFst Memory Allocation

DaniloDeOliveira - 2018-08-24 - 16:04

Hi, I’m trying to manage the memory usage of a ComposeFst object for a lazy decoder. I monitor the size of the vector of known states in VectorCacheStore, as well as the composition state table, neither of which is garbage collected, with the following methods:

*Vector of known states:*

  • Inside VectorCacheStore:
<verbatim>size_t GetExtraAllocBytes() { return state_vec_.size() * sizeof(State *); }</verbatim>

*State table:*

  • Inside ComposeFstImpl:
<verbatim>size_t GetStateTableSize() { return GetStateTable()->Size() * sizeof(StateTuple); }</verbatim>

I use the sum of the output of these two methods, along with the garbage collector limit, to estimate the total memory allocated by ComposeFst so that I can re-initialize the ComposeFst object periodically. However, when I monitor memory usage via the top command, it doesn’t match this estimate: top reports more memory in use than I compute. Am I missing anything?

MichaelRiley - 2018-08-24 - 22:41

Are you accounting for the size of each State that is pointed to (not fixed since it contains a vector of arcs)? Also for vector based components you'd expect a non-trivial difference between their size() and capacity(). Fragmentation could be another issue.
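The gap described in the reply above can be illustrated with a rough model. This is a sketch in plain Python arithmetic, not OpenFst code: the `CachedState` class is hypothetical, and the byte sizes (8-byte pointers, a 48-byte per-state header, 16-byte arcs) are assumptions picked for illustration rather than values taken from the library. Fragmentation is not modeled at all.

```python
from dataclasses import dataclass

POINTER_BYTES = 8        # assumed 64-bit pointers
STATE_HEADER_BYTES = 48  # hypothetical fixed per-state overhead
ARC_BYTES = 16           # assumed arc size: two int labels, float weight, int next state


@dataclass
class CachedState:
    """Hypothetical stand-in for a cached state that owns a vector of arcs."""
    num_arcs: int      # arcs actually stored (the vector's size)
    arc_capacity: int  # arcs allocated (the vector's capacity)


def pointer_table_estimate(states):
    """The estimate from the question: number of known states times pointer size."""
    return len(states) * POINTER_BYTES


def fuller_estimate(states, table_capacity):
    """Adds what the reply lists: each pointed-to state, and capacity instead of size."""
    table = table_capacity * POINTER_BYTES
    per_state = sum(STATE_HEADER_BYTES + s.arc_capacity * ARC_BYTES for s in states)
    return table + per_state


states = [CachedState(num_arcs=3, arc_capacity=4),
          CachedState(num_arcs=10, arc_capacity=16)]
print(pointer_table_estimate(states))              # pointers only
print(fuller_estimate(states, table_capacity=4))   # pointers, states, and spare capacity
```

Even in this toy model the pointer-only figure is a small fraction of the fuller one, which is consistent with top reporting more memory than the original estimate.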

openfst-1.6.8.tar.gz checksum changed

SimonPodlesny - 2018-08-02 - 06:55

Hello,

The last time I deployed the OpenFst library in my project, I verified the download against a checksum because the connection was insecure. Now that I want to deploy the project again, the checksum is different. Is it possible that the content of the library was changed after it was made available for download? When I last downloaded the library, the current version was 1.6.8 and I copied the sha256 checksum from the website (ba5a36662635eb68c202c0133d6137575342a5d507c2875fb0c859c5f199ead9); now the sha256 checksum is: af3f69ad3e32363e3a8c0e5953396ed35ee3130a0f27264b005879aeefd43236

KyleGorman - 2018-08-25 - 18:57

I don't know, but can you just use the more recent one that's up?

Implementing FST composition with "slop"

PiotrZelasko - 2018-06-29 - 11:40

Hi there!

Is there an option to perform FST composition with OpenFST which would allow some kind of "slop" parameter, similarly as in ElasticSearch (see the first example in attached link)? If not, what do you think would be the best approach to implement it? I'm thinking about lookahead matchers but perhaps there is a simpler way.

Reference: https://www.elastic.co/guide/en/elasticsearch/guide/current/slop.html

KyleGorman - 2018-07-12 - 00:09

I don't know how "slop" is implemented there but an obvious implementation of fuzzy matching with FSTs is described under FstExamples (see the section entitled "Edit Distance"). Here's an implementation using our Python wrappers:

https://github.com/kylebgorman/EditTransducer

That lets you match two strings that are arbitrarily far apart in edit distance. If you only want to consider something a match when it's up to k characters (symbols) away, then there's an easier approach, and it's in fact much faster at runtime: instead of composing with a cyclic edit transducer, you compose with a transducer that allows zero or one edits, and compose that with itself up to k times.
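The "zero or one edits, applied up to k times" idea can be sketched without any FST library. The code below is an illustration on plain Python string sets, not Pynini or OpenFst code; the function names and the explicit `alphabet` argument are assumptions made for the example. Each call to `zero_or_one_edit` plays the role of one composition with the bounded edit transducer.

```python
def zero_or_one_edit(strings, alphabet):
    """Return every string reachable from `strings` by at most one edit."""
    out = set(strings)  # zero edits: keep the originals
    for s in strings:
        for i in range(len(s) + 1):
            for c in alphabet:
                out.add(s[:i] + c + s[i:])          # insertion at position i
            if i < len(s):
                out.add(s[:i] + s[i + 1:])          # deletion of s[i]
                for c in alphabet:
                    out.add(s[:i] + c + s[i + 1:])  # substitution of s[i]
    return out


def within_k_edits(source, target, k, alphabet):
    """True if `target` is at most k edits away from `source`."""
    reachable = {source}
    for _ in range(k):  # apply the one-edit step k times
        reachable = zero_or_one_edit(reachable, alphabet)
    return target in reachable


print(within_k_edits("cat", "cut", 1, "acut"))
print(within_k_edits("cat", "dog", 2, "acdgot"))
```

The set of reachable strings grows quickly with k and the alphabet size, so this is only meant to show the logic; an FST represents the same set compactly.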

Binaries exit silently reading unknown FST weight type

KirillKatsnelson - 2018-05-30 - 08:26

Any binary that reads an .fst file (fstinfo, for example) just silently exits with exit code 1 when the weight type in the file is unknown. I think this started to happen after version 1.6.3.

The code in script/fst-class.cc checks if the returned reader function pointer is null, but for an unknown arc_type the returned pointer is to the NullReader function, which returns a null pointer when called.

The binary's main function then just exits without invoking the FSTERROR().

KyleGorman - 2018-06-13 - 20:49

Thanks, fixed.
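The failure mode reported in this thread can be sketched in a few lines. This is a Python analogy, not the code in script/fst-class.cc: the registry, `null_reader`, and both `read_fst_*` functions are hypothetical names. The point is that checking the looked-up reader for null is not enough when the lookup returns a fallback reader that itself returns nothing.

```python
def null_reader(stream):
    """Fallback reader: always reports failure by returning None."""
    return None


# Hypothetical registry mapping arc types to reader functions.
READERS = {"standard": lambda stream: {"arc_type": "standard"}}


def lookup_reader(arc_type):
    """Never returns None: unknown arc types get the fallback reader."""
    return READERS.get(arc_type, null_reader)


def read_fst_buggy(arc_type, stream=None):
    reader = lookup_reader(arc_type)
    if reader is None:  # never true, so the error path is dead code
        raise ValueError(f"Unknown arc type: {arc_type}")
    return reader(stream)  # silently returns None for unknown arc types


def read_fst_checked(arc_type, stream=None):
    fst = lookup_reader(arc_type)(stream)
    if fst is None:  # check the result of the read, not the reader itself
        raise ValueError(f"Unknown arc type: {arc_type}")
    return fst


print(read_fst_buggy("bogus"))  # no error is raised here
```

In the checked variant an unknown arc type produces an error message instead of a silent failure; how the actual fix was implemented is not stated in the thread.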

Avoiding use of class values during backoff

JackNaru - 2018-05-22 - 15:07

This may be a simple question, but wanted to understand how we could avoid values from classes being used during backoff?

missing PROGRAM FLAGS in openfst-1.6.7 on cygwin

RolandSchwarz - 2018-05-18 - 07:30

Hi,

I've tried hard over the last days to compile openfst 1.6.7 on Cygwin and managed to get it to compile and link without error. However, when I try to use the command line tools, no 'PROGRAM FLAGS' are accessible. Here's (part of) the output from e.g. fstcompile:

fstcompile --help
Creates binary FSTs from simple text format.

Usage: fstcompile [text.fst [binary.fst]]

PROGRAM FLAGS:

LIBRARY FLAGS:

Flags from: flags.cc
--help: type = bool, default = false
show usage information
--helpshort: type = bool, default = false
show brief usage information
--tmpdir: type = string, default = "/tmp"
temporary directory
--v: type = int32, default = 0
verbosity level

Flags from: fst.cc
--fst_align: type = bool, default = false
Write FST data aligned where appropriate
--fst_default_cache_gc: type = bool, default = true
Enable garbage collection of cache
--fst_default_cache_gc_limit: type = int64, default = 1048576
Cache byte size that triggers garbage collection
--fst_read_mode: type = string, default = "read"
Default file reading mode for mappable files
--fst_verify_properties: type = bool, default = false
Verify FST properties queried by TestProperties
--save_relabel_ipairs: type = string, default = ""
Save input relabel pairs to file
--save_relabel_opairs: type = string, default = ""
Save output relabel pairs to file

[...]

PROGRAM FLAGS are empty, which prevents me from passing, e.g., a symbol table to the binary. Interestingly, all flags that come from the library (fst.cc, symbol_table.cc, util.cc and weight.cc) are there.

Any help would be appreciated, I tried a lot already but can't make sense of it.

Thanks,

Roland

RolandSchwarz - 2018-05-18 - 16:12

From some preliminary debugging it seems that the registers which hold the flags change. For example if

auto bool_register = FlagRegister<bool>::GetRegister();

returns the bool FlagRegister at address X in fstcompile.cc, it also returns X in fstcompile-main.cc, but as soon as 'ShowUsage' in flags.cc is called the same command returns a different address Y.

So it seems the flags specific for fstcompile for example are put in a different register than the library flags and are never retrieved again.

KyleGorman - 2018-05-23 - 11:39

We are aware of this issue and it is fixed in an upcoming version.

why are we losing 1 bit in expression of labels and states ?

PtiZoom - 2018-04-26 - 18:27

Can someone explain why keys, indexes and states are not systematically unsigned? As it is, we are losing (MAX_UINT/2 - 1) addressable values. Thanks.

KyleGorman - 2018-05-23 - 11:43

The constants fst::kNoStateId and fst::kNoLabel, for instance, are both -1, and negative values should be treated as implementational details.

But the entire library is templated on a definition of arc so if you don't like this you could always write your own arc template like so:

template <class Weight>
struct PreciseArcTpl {
  ssize_t ilabel;
  ssize_t olabel;
  Weight weight;
  ssize_t nextstate;

  // Your constructor goes here.
};

But IMO `int` tends to have enough precision for all but truly gigantic machines.

RandGen () with LogProbArcSelector question

KennethRBeesley - 2018-04-26 - 17:00

I have an FST with StdArc (Tropical Semiring). Using RandGen() with the default UniformArcSelector, I seem to get uniformly random results, as expected.

But when I try to use the LogProbArcSelector, e.g.

<verbatim>
int seed = rand() % 57;
// std::cout << "Rand: " << seed << std::endl;
fst::LogProbArcSelector<fst::StdArc> selector(seed);
fst::RandGenOptions<fst::LogProbArcSelector<fst::StdArc>> options(selector);
fst::RandGen(fst, &random_path, options);
</verbatim>

then it compiles and runs, but the results are not as expected. (I expect to get weighted-random results, biased by the relative weights of the paths in the FST.)

1. Am I thinking correctly about the LogProbArcSelector and what it's supposed to do?
2. Am I using it correctly? Does it work with StdArc FSTs?
3. Are there useful examples anywhere that I could study?

KennethRBeesley - 2018-04-26 - 17:10

I see that <verbatim>...</verbatim> didn't work very well. Here's another try with Markdown

I have an FST with StdArc (Tropical Semiring). Using RandGen() with the default UniformArcSelector, I seem to get uniformly random results, as expected. But when I try to use the LogProbArcSelector, e.g.

```
int seed = rand() % 57;
// std::cout << "Rand: " << seed << std::endl;
fst::LogProbArcSelector<fst::StdArc> selector(seed);
fst::RandGenOptions<fst::LogProbArcSelector<fst::StdArc>> options(selector);
fst::RandGen(fst, &random_path, options);
```

then it compiles and runs, but the results are not as expected. (I expect to get weighted-random results, biased by the relative weights of the paths in the FST.)

```
1. Am I thinking correctly about the LogProbArcSelector and what it's supposed to do?
2. Am I using it correctly? Does it work with StdArc FSTs?
3. Are there useful examples anywhere that I could study?
```
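For readers comparing their results against expectations, here is a small sketch of the selection rule that, as far as I understand the OpenFst documentation, LogProbArcSelector applies at each state: arc weights are read as negative log probabilities and normalized over the arcs leaving the state. The code is plain Python, not OpenFst; `arc_probabilities` and `select_arc` are hypothetical names, and the case where every arc has infinite weight is not handled.

```python
import math
import random


def arc_probabilities(weights):
    """Normalize exp(-w) over the arcs leaving one state."""
    scores = [math.exp(-w) for w in weights]  # an infinite weight contributes 0
    total = sum(scores)
    return [s / total for s in scores]


def select_arc(weights, rng):
    """Pick an arc index with probability proportional to exp(-weight)."""
    probs = arc_probabilities(weights)
    return rng.choices(range(len(weights)), weights=probs, k=1)[0]


rng = random.Random(57)         # fixed seed so the run is repeatable
weights = [0.0, math.log(3.0)]  # read as negative log probabilities
counts = [0, 0]
for _ in range(10000):
    counts[select_arc(weights, rng)] += 1
print(arc_probabilities(weights))
print(counts)
```

Under this rule a weight difference of log(3) makes one arc three times as likely as the other, so weights that are close together in the tropical semiring yield nearly uniform samples.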

building 1.6.7 problem

AlexanderRudnicky - 2018-03-20 - 10:50

I am getting the following error when I 'make' the project:

<verbatim>
fst.cc:86:34: required from here
./../include/fst/util.h:202:3: error: unable to deduce 'const auto&' from '<expression error>'
make[3]: *** [fst.lo] Error 1
</verbatim>

I'm doing a vanilla build, with the following:
<verbatim>
wget http://www.openfst.org/twiki/pub/FST/FstDownload/openfst-1.6.7.tar.gz

./configure CXX=g++47 --enable-static --enable-shared --enable-far --enable-ngram-fsts
[also-->] ./configure CXX=g++47
make -j 4
[also-->] make
</verbatim>
The machine has:
<verbatim>
$cat /etc/*elease
CentOS release 6.2 (Final)
$g++47 -dumpversion
4.7.0
</verbatim>

I would have assumed that the distribution compiles without problems. I'm not a C++ person, so I'm not even sure what the error is about. Any ideas on what to try next? Thanks!

AlexanderRudnicky - 2018-03-20 - 11:01

Hmm. The markup instructions above aren't clear. Here's the error again:
<verbatim>
fst.cc:86:34: required from here
./../include/fst/util.h:202:3: error: unable to deduce 'const auto&' from '<expression error>'
make[3]: *** [fst.lo] Error 1
</verbatim>

KyleGorman - 2018-03-26 - 13:20

I haven't seen that before. I understand what it's saying (it's saying that some type deduction is failing on that line in util.h) but not why your compiler can't do type deduction there. Is it possible to upgrade your GCC beyond 4.7.0 (either by upgrading a package or your OS)? That version of GCC is 6 years old, which is a long time in C++ terms.

Successful cross-compilation of OpenFST, OpenGRM NGram and OpenGRM Thrax using MinGW

WincentBalin - 2018-03-15 - 16:31

Hello world!

I succeeded in compiling the OpenFST, NGram and Thrax packages with MinGW using Docker. You will find the Git repository here: https://github.com/wincentbalin/compile-static-openfst The resulting MinGW binaries are static, for both win32 and win64. You will find them in the list of releases: https://github.com/wincentbalin/compile-static-openfst/releases

I had to create a couple of patches, which I then put into the repository above. Some of them have already become obsolete and were deleted. I hope that we might incorporate some of them into the main source code. I suppose that is much more feasible than trying to fork and adapt every single version to MSVC, only to abandon it later. There are far too many such repositories on GitHub.

I am looking forward to any question or opinion!

WincentBalin - 2018-03-15 - 16:34

Another question: may I post a link to this thread to the NGram and Thrax forums?

KyleGorman - 2018-05-23 - 11:45

Sure.

Linear FST error with other fst operation

MinseokKeum - 2018-03-07 - 03:36

Hello,

I tried to use linear fst extension.

I executed the below example as shown in http://www.openfst.org/twiki/bin/view/FST/FstExtensions.

fstlinear -vocab=vocab.txt -out=out.fst -start_symbol=NULL -end_symbol=NULL model1.txt model2.txt

It generated out.fst, but it cannot be accessed with fstinfo, or fstprint. It outputs error as below.

ERROR: GenericRegister::GetEntry: linear_tagger-fst.so: cannot open shared object file: No such file or directory
ERROR: Fst::Read: Unknown FST type linear-tagger (arc type = standard): out.fst

I executed ldconfig, but cannot fix the problem. Is there any report on this problem?

KyleGorman - 2018-03-07 - 16:33

Find where linear_tagger-fst.so is located (probably something like /usr/local/lib/fst) and add it to your LD_LIBRARY_PATH. Then this should work. E.g.:

LD_LIBRARY_PATH=/usr/local/lib/fst fstinfo out.fst

MinseokKeum - 2018-03-08 - 01:31

Ok! Your advice solved the problem. I see that LD_LIBRARY_PATH has already come up as a solution many times in this forum. Thank you.

Pynini and destructive operations

TanelAlumae - 2018-02-21 - 08:05

Hello,

I am trying to use Pynini in an NLP course for teaching FSTs. I find it great but I am a bit confused by the destructive operations in the Pynini API.

Let's say I have the following FSTs (inspired by Jason Eisner's homework at https://www.cs.jhu.edu/~jason/465/hw-ofst/hw-ofst.pdf):

import pynini as pn
zero = pn.a("0")
one = pn.a("1")
bit = zero | one
first = (zero + one + one + one + one.ques).optimize()

Now, a good way to rewrite 'first' seems to be something like:

first = (zero + one.closure(3,4)).optimize()

However, after executing this, 'one' is no longer /1/ but /1{3,4}/. I find this unintuitive. OK, I can use the copy method:

first = (zero + one.copy().closure(3,4)).optimize()

But is there a better way to do this, without using excessive calls to copy()?

KyleGorman - 2018-03-06 - 16:44

Hi Tanel, for nearly all destructive methods of the form:

f.method(*args, **kwargs) # optionally returns `self`

there is an equivalent non-destructive form:

g = method(f, *args, **kwargs)

This just calls copy under the hood, of course.

BTW calling copy is not itself expensive. The major data classes in this library use reference-counting, copy-on-write semantics. The only time a deep copy happens is when you mutate something with a reference count > 1.
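The reference-counting, copy-on-write behavior described above can be sketched as follows. This is a toy Python model, not Pynini's implementation: `CowFst`, `_Impl`, and the explicit `refs` counter are hypothetical, and a real FST stores far more than a list of arcs.

```python
class _Impl:
    """Shared storage with an explicit reference count."""

    def __init__(self, arcs):
        self.arcs = list(arcs)
        self.refs = 1


class CowFst:
    """Toy handle whose copies share storage until one of them is mutated."""

    def __init__(self, arcs=(), _impl=None):
        self._impl = _impl if _impl is not None else _Impl(arcs)

    def copy(self):
        self._impl.refs += 1  # cheap: only the counter changes
        return CowFst(_impl=self._impl)

    def add_arc(self, arc):
        if self._impl.refs > 1:  # storage is shared, so deep-copy before writing
            self._impl.refs -= 1
            self._impl = _Impl(self._impl.arcs)
        self._impl.arcs.append(arc)

    def arcs(self):
        return tuple(self._impl.arcs)

    def shares_storage_with(self, other):
        return self._impl is other._impl


a = CowFst(["x"])
b = a.copy()
print(a.shares_storage_with(b))  # storage is still shared after copy()
b.add_arc("y")                   # the first mutation triggers the deep copy
print(a.arcs(), b.arcs())
```

In this model a copy that is never mutated costs only a counter increment, which matches the point that calling copy is not itself expensive.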

find N unique ShortestPaths for StdVectorFst using c++

VarunKumar - 2018-02-15 - 09:48

I am trying to find N unique shortest paths for a given StdVectorFst. What should be the value of "weight_threshold" for StdVectorFst in ShortestPathOptions?

fst::StdVectorFst* input = fst::StdVectorFst::Read("");
fst::StdVectorFst result;

// shortest path options
fst::QueueType queue_type = fst::AUTO_QUEUE;
const s::ShortestPathOptions shortest_path_opts(queue_type, n, true, fst::kDelta, weight_threshold, fst::kNoStateId);

KyleGorman - 2018-03-06 - 16:47

You should only set the weight threshold to something other than semiring zero if you also want to prune paths. See the documentation for Prune which goes over the associated semantics of pruning with a weight threshold.
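To make the role of the threshold concrete, here is a sketch on a plain list of path costs in the tropical semiring, where semiring zero is positive infinity and the semiring product is addition. It is plain Python rather than an OpenFst call; `n_best` and its parameters are hypothetical names, and real pruning operates on states and arcs rather than on enumerated paths.

```python
import math

TROPICAL_ZERO = math.inf  # semiring zero of the tropical semiring


def n_best(path_costs, n, weight_threshold=TROPICAL_ZERO):
    """Return up to n cheapest costs, dropping paths beyond best + threshold."""
    if not path_costs:
        return []
    best = min(path_costs)
    # With the default threshold the bound is infinite, so nothing is pruned.
    kept = sorted(c for c in path_costs if c <= best + weight_threshold)
    return kept[:n]


costs = [3.0, 1.0, 2.5, 7.0]
print(n_best(costs, 3))                        # no pruning
print(n_best(costs, 3, weight_threshold=1.6))  # prunes paths costing more than 2.6
```

With the threshold left at semiring zero, only n limits the result; a finite threshold can return fewer than n paths.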

weight type conversion in fst

IkeYuki - 2017-12-10 - 23:08

Is there a method to convert fst's weight? I want to get VectorFst<LogArc> from VectorFst<StdArc> such as HCLG.fst.

IkeYuki - 2017-12-15 - 07:19

I tried to use ArcMap, but an error occurred.
<verbatim>
fst::MutableFst<fst::LogArc> *fst;
fst::ArcMap<fst::LogArc,fst::WeightConvertMapper<fst::StdArc,fst::LogArc>>(*fst,fst::WeightConvertMapper<fst::StdArc,fst::LogArc>());
</verbatim>

<verbatim> No matching function for call to ArcMap(fst::MutableFst<fst::ArcTpl<fst::LogWeightTpl<float> > >&, fst::WeightConvertMapper<fst::ArcTpl<fst::TropicalWeightTpl<float> >, fst::ArcTpl<fst::LogWeightTpl<float> > >)' </verbatim>

Any ideas about this? Thanks in advance.

KyleGorman - 2018-01-16 - 16:48

Hi! Yes, such methods exist at several levels. At the template library level, try something like (these aren't tested but should work, with at most minor changes):

 
fst::VectorFst<fst::StdArc> std_fst;
// Populate std_fst here...
fst::VectorFst<fst::LogArc> log_fst;
fst::ArcMap(std_fst, &log_fst, fst::StdToLogMapper());

Or, you can use the binary interface:

fstmap --map_type=to_log std.fst log.fst

Or, from Python:

log_fst = pywrapfst.arcmap(std_fst, map_type="to_log")

Intersect operation gives error on symbol tables mismatch

SanJoshi - 2017-11-16 - 04:31

I want to check if a string exists in a set of strings. I created two FSTs

The first FST contains the set of all strings as shown here http://www.openfst.org/twiki/bin/view/Forum/FstForum#Making_an_FST_with_C_How_to_defi

The second SearchFST contains a single string (i.e. State1 -> a/x -> State2 -> b/y -> FinalState)

The input and output symbol tables for both FST are the same.

Now I sorted both the FSTs, the first on output labels, the second on input labels

ArcSort(&set_fst, StdOLabelCompare())
ArcSort(&search_fst, StdILabelCompare())

When I call StdIntersectFst output_fst(set_fst, search_fst)

I get this error:

WARNING: CompatSymbols: Symbol table checksums do not match. Table sizes are 3 and 3
FATAL: ComposeFst: Output symbol table of 1st argument does not match input symbol table of 2nd argument

Slide 28 here http://www.openfst.org/twiki/pub/FST/FstSltTutorial/part1.pdf seems to depict that both FSTs in an Intersect have the same symbol table.

What am I doing wrong ?

Get raw arc arrays from VectorFst or ArcIterator.

JustinLuitjens - 2017-11-09 - 16:45

Is there a method to get the raw arc arrays from either an Fst or an ArcIterator? I'd like to be able to parallelize an ArcIterator loop, but to do that I need a way to get each arc in constant time.

-- Michael Riley - 2022-11-11

Topic revision: r1 - 2022-11-12 - MichaelRiley
 