
I was reading a lot about Haskell Parser Combinators and found a lot of topics like:

But all these topics compare Parser Combinators with Parser Generators.

I want to ask which parser combinator library best suits the following conditions:

  1. I want good control over errors (including error recovery) and over the messages shown to the user.
  2. I want to be able to feed the parser small parts of the text (not the whole file at once).
  3. I want to be able to redesign the grammar easily (I'm currently developing the grammar, so a "nice way of working" is important).
  4. The final parser should be fast (performance is important, but not as much as points 1-3).

I've found that the most popular parser combinators are:

Wojciech Danilo
  • I only know of parsec, which is a monadic parser. It is fairly easy to use and you have good control over errors. – Jocke Aug 03 '13 at 01:12
  • I have only used `attoparsec` so I don't know first hand about Parsec. attoparsec has a reputation for being extremely fast but not so great on the error messages front. It's targeted at back end parsing needs that a front end user should never see error messages from. – asm Aug 03 '13 at 01:33
  • @AndrewMyers, I agree, I've used attoparsec for just that sort of thing and it is very fast and simple (especially for simple-ish grammars). It doesn't come with a lot of fancy functions like the normal parsec library on hackage but it supports `Text` which is awesome. – Wes Aug 03 '13 at 03:57
  • `attoparsec` error messages are completely unusable, but it is very fast. – Gabriella Gonzalez Aug 03 '13 at 03:58
  • I find the documentation for uu-parsinglib sparse when it comes to examples of anything non-trivial. Parsec is much better served in this regard. – OllieB Aug 03 '13 at 11:33
  • @OllieB: can you give an example of how you've found the uu-parsinglib docs lacking? I've always found them to cover everything I've needed. – John L Aug 03 '13 at 22:53
  • `uu-parsinglib` is, to my understanding, well equipped for 1, 2, and 3 all the way to the point of suggesting the correct syntax to the user (and even inputting it automatically, though that can be annoying). The documentation is best gotten by reading ["Combinator Parsing: A Short Tutorial"](http://www.cs.tufts.edu/~nr/cs257/archive/doaitse-swierstra/combinator-parsing-tutorial.pdf). – J. Abrahamson Aug 04 '13 at 04:35
  • You also may like to look at [`parsers`](http://hackage.haskell.org/package/parsers) and [`trifecta`](http://hackage.haskell.org/package/trifecta). – J. Abrahamson Aug 04 '13 at 04:36
  • @JohnL: Well, I had great trouble understanding how to use it with alex providing tokens. I managed, so it is all there if you're willing to read through the source, but as I said in the earlier comment, it's not covered by examples, so it's quite involved to pick up. Another example would be using it monadically. There are no examples of that, as far as I can see, and the first I knew about having to use addLength with monadic use was error messages at run time. – OllieB Aug 04 '13 at 09:37
  • Thank you. @J.Abrahamson: did you use `parsers` or `trifecta`? Could you elaborate a little bit more why they could be better than `Parsec` or `uu-parsinglib`? – Wojciech Danilo Aug 07 '13 at 18:45
  • I've been experimenting with `parsers`/`trifecta` for a little bit. I'm not certain that they're significantly *better* than uu-parsinglib though. As far as I've seen, familiarity should be the major decision factor. Though, if you've seen one parsing combinator library you've pretty much seen them all. – J. Abrahamson Aug 07 '13 at 18:47
  • @J.Abrahamson - thank you - I'll try them both :) Additionally, it seems that you're familiar with `uu-parsinglib`. Could you please tell me (if possible) whether `uu-parsinglib` has more advanced or better features than `parsec`? Am I right that, if it can suggest the syntax, I do not have to use tools like `Alex` with it? – Wojciech Danilo Aug 07 '13 at 19:00
  • `uu-parsinglib` does somewhat more than `Parsec` "out of the box". The best way to get an appreciation of the differences is to read that tutorial I linked above. – J. Abrahamson Aug 07 '13 at 19:06
  • @J.Abrahamson: Is there a way to disable the automatic insertion of correct syntax for the user? (I've read the article you linked above - not all, but enough to grasp the uu-parsinglib ideas). – Wojciech Danilo Aug 12 '13 at 14:34
  • Hm, looking at the code I think the main interface is `parse`, which is rather simple. It's pretty straightforward to implement another `P` type though it's a bit of work---[the source of Core](http://hackage.haskell.org/packages/archive/uu-parsinglib/2.8.1/doc/html/src/Text-ParserCombinators-UU-Core.html#parse) has examples for future parsers, history parsers, recognizers, partial parsers, and defaults---but it sadly doesn't export the `T` type which contains them and the `parse` and `parse_h` types are oversimplified—what you want to do is replace the `eval :: Steps a -> a` function. – J. Abrahamson Aug 12 '13 at 15:57
  • In particular, the `eval` `Fail` branch and `get_cheapest` calls are what do the rewriting. – J. Abrahamson Aug 12 '13 at 15:58
  • @J.Abrahamson: Thank you :) I'll look into it. Could you please also take a look at this question? I asked it because I'm a little confused about how the `lexer + parser` architecture should be set up in `uu-parsinglib`: http://stackoverflow.com/questions/18214179/combining-lexer-and-parser-in-a-parser-combinator – Wojciech Danilo Aug 13 '13 at 16:23

2 Answers


I would say definitely go with Parsec. Here's why:

Attoparsec is designed to be quick to use, but lacks the strong support for error messages you get in Parsec, so Parsec wins on your first point.
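To illustrate the error-message side, here is a minimal sketch using Parsec's `<?>` labels. The grammar (a bracketed list of integers) and the names `integer` and `intList` are made up for the example, not taken from the question.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- A single integer, labelled so that a failure reports "expecting integer"
-- instead of listing individual digit characters.
integer :: Parser Int
integer = read <$> many1 digit <?> "integer"

-- A bracketed, comma-separated list such as "[1,2,3]" (hypothetical grammar).
intList :: Parser [Int]
intList = between (char '[') (char ']') (integer `sepBy` char ',')

main :: IO ()
main = do
  -- A well-formed input gives a Right value.
  print (parse (intList <* eof) "<example>" "[1,2,3]")
  -- A malformed input gives a Left ParseError that carries the source
  -- position and the labels of what was expected at that point.
  print (parse (intList <* eof) "<example>" "[1,,3]")
```

The `ParseError` value can also be inspected programmatically (for instance with `errorPos`), which helps if you want to format messages yourself.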

My experience of using parser combinator libraries is that it is really easy to test individual parts of the parsers, either in GHCi or in tests, so the second point is satisfied by all of them really. As for speed, both Attoparsec and Parsec are pretty darn fast.
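If point 2 is instead read as feeding the input chunk by chunk, the sketch below shows what that looks like with attoparsec's `parse` and `feed`, which return a `Partial` result when more input is needed. The `numbers` grammar and the helpers `feedAll` and `describe` are hypothetical names for this example only.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.Text (IResult (..), Parser, Result, char, decimal, feed, parse, sepBy1)
import Data.Text (Text)

-- Comma-separated integers (hypothetical grammar for the example).
numbers :: Parser [Int]
numbers = decimal `sepBy1` char ','

-- Feed the chunks one at a time, then an empty chunk to signal end of input.
feedAll :: Result a -> [Text] -> Result a
feedAll r chunks = feed (foldl feed r chunks) ""

-- Turn the three possible outcomes into a readable string.
describe :: Show a => Result a -> String
describe (Done rest a)     = "done: " ++ show a ++ ", leftover " ++ show rest
describe (Fail rest _ msg) = "failed: " ++ msg ++ ", at " ++ show rest
describe (Partial _)       = "needs more input"

main :: IO ()
main = do
  let start = parse numbers "1,2"   -- first chunk only
  putStrLn (describe start)
  putStrLn (describe (feedAll start ["3,4", "5"]))
```

Chunk boundaries do not have to line up with token boundaries: here `2` and `3` arrive in different chunks but are read as the single number 23.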

Finally, Parsec has been around longest and has many useful and advanced features. This means that general maintainability is going to be easier, more examples are in Parsec and more people are familiar with it. uu-parsinglib is definitely worth the time to explore, but I would suggest that getting familiar with Parsec first is the better course for these reasons. (Alex is also the most recommended lexer to use with Parsec or otherwise, but I have not used it myself.)

Vic Smith
  • Good point. I've deleted my half of the conversation. This message will self-destruct in five seconds, give or take a day. – AndrewC Aug 07 '13 at 21:12

I will post my answer here in case somebody finds this question. The current answer is quite outdated.

It's better to use the megaparsec package as your parser combinator library. It's a modern, production-ready library, and its README.md contains an excellent comparison with other parser combinator libraries.
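As a rough illustration, here is a minimal sketch of a megaparsec parser with a pretty-printed error. It assumes megaparsec 7 or later (where `parse` returns a `ParseErrorBundle`); the grammar and the names `lexeme`, `integer` and `sumExpr` are invented for the example.

```haskell
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char (char, space)
import qualified Text.Megaparsec.Char.Lexer as L

-- No custom error component (Void), parsing plain Strings.
type Parser = Parsec Void String

-- Wrap a parser so that it also skips trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme = L.lexeme space

-- A labelled integer token.
integer :: Parser Int
integer = lexeme L.decimal <?> "integer"

-- A sum such as "1 + 2 + 3" (hypothetical grammar), evaluated while parsing.
sumExpr :: Parser Int
sumExpr = sum <$> sepBy1 integer (lexeme (char '+'))

main :: IO ()
main =
  case parse (sumExpr <* eof) "<input>" "1 + 2 + x" of
    -- errorBundlePretty renders the offending line with a caret marker.
    Left bundle -> putStr (errorBundlePretty bundle)
    Right n     -> print n
```

For error recovery (point 1 of the question), megaparsec also provides `withRecovery`, which is worth looking up in its documentation.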

Shersh