
I am trying to find something that will parse very large files (PGN files, basically). I started using antlr4, but even though they claim that their classes are "streams", they aren't. antlr4 takes my 5,457,518-game test file, tries to load the entire 1.7 GB file into one gigantic string, and then parses it, causing an out-of-memory crash. So I threw it out and am now trying moo/nearley.

Well, it seems I have a similar problem. Even though both moo and nearley provide methods that take a so-called "chunk" as a parameter, moo in particular fails to realize that it's at the end of its string and could get more on the next moo.feed.

My test program, for example, tries to send this to moo, two bytes at a time: [Abcde "bc def"]. It spits out LBRACKET correctly, but then it spits out A as a symbol. If I do a moo.reset(next_two), it then spits out bc as a second symbol.

So my question is: how exactly do you, master lexer/parser, do this? Should I go back to antlr4? Should I use moo/nearley in a different way? Is there a better lexer/parser out there? I really don't want to write my own from scratch, but I'm starting to wonder if there is any other way.
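For what it's worth, since PGN tag pairs are line-oriented, one workaround I've been considering is to do the chunk buffering myself and only hand complete lines to the lexer, so it never sees a token split across a chunk boundary. This is a hypothetical sketch (the `onLine` callback stands in for whatever moo/nearley feed call you'd use):

```javascript
// Accumulate raw chunks and emit only complete lines, so the lexer
// never sees a token split across a chunk boundary.
class LineBuffer {
  constructor(onLine) {
    this.partial = '';   // leftover bytes after the last newline
    this.onLine = onLine; // called once per complete line
  }
  feed(chunk) {
    this.partial += chunk;
    let idx;
    while ((idx = this.partial.indexOf('\n')) !== -1) {
      this.onLine(this.partial.slice(0, idx));
      this.partial = this.partial.slice(idx + 1);
    }
  }
  end() {
    // flush any trailing line that had no final newline
    if (this.partial.length > 0) this.onLine(this.partial);
    this.partial = '';
  }
}

// Example: feed a tag pair two bytes at a time, as in my test above.
const lines = [];
const buf = new LineBuffer((line) => lines.push(line));
const input = '[Abcde "bc def"]\n1. e4 e5\n';
for (let i = 0; i < input.length; i += 2) buf.feed(input.slice(i, i + 2));
buf.end();
console.log(lines); // [ '[Abcde "bc def"]', '1. e4 e5' ]
```

Inside `onLine` you could then call something like `lexer.reset(line)` per line, since a single PGN tag-pair line is a complete lexing unit. I'm not sure this is the "right" way, though, which is why I'm asking.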

Sid110307
  • What language do you wish to write your parser in? – rici Apr 15 '20 at 01:06
  • Ha! I tried so hard to make sure I was as complete as I was able to be. I am writing in Node (Meteor, JavaScript). Apologies for that. – David Logan Apr 15 '20 at 04:47
  • Actually, you can handle very big input with antlr4. It needs a different setup, and you disable creation of the parse tree: https://stackoverflow.com/questions/16432469/is-it-possible-to-parse-big-file-with-antlr – Alkis Kalogeris Apr 22 '20 at 22:43

0 Answers