
I'm studying compiler construction and naturally I'm also studying real world implementations of these concepts. One example of this is Babel's parser: Babylon.

I went through Babylon's code, and it appears to use a top-down parser with embedded ad hoc semantic rules.

I was expecting Babel to use a member of the LR family of parsers, probably with a definition file where the grammar productions are coupled with semantic rules. Why? Mostly because a number of other real-world languages use LR parser generators such as Yacc and Bison, which give you exactly this interface. That seems like a clearer and more maintainable way of representing these rules, even more so when you consider that Babel lives on the edge of the JavaScript standard, implementing new things all the time.

I have also built both top-down and bottom-up (LR) parsers, and I don't see a big difference in implementation difficulty between the two (both are equally difficult :) ).

So why does Babel's parser use top-down, ad hoc syntax-directed translation instead of what I see as a more structured approach? What are the design decisions behind that? What am I missing?

Thanks!

franleplant
  • Because English is read top to bottom. – StackSlave Oct 16 '17 at 23:36
  • @PHPglue That’s not what “[top-down](https://en.wikipedia.org/wiki/Top-down_and_bottom-up_design)” refers to. What does English have to do with this? – Sebastian Simon Oct 16 '17 at 23:40
  • This question probably isn't a good fit for SO since generally the answer is "because someone decided it should be". Also in this case Babylon started as a fork of https://github.com/ternjs/acorn so most architectural questions aren't Babylon-specific. – loganfsmyth Oct 16 '17 at 23:41

1 Answer


I feel like you're really asking two (or maybe three) questions, so I'll address them separately.

In general, what are the advantages and disadvantages of different approaches to parsing?

Top down vs. bottom up

For hand-written parsers the situation is actually pretty clear: top-down parsers are much easier to write and maintain, to the point that I've never even seen a hand-written bottom-up parser.

For parser generators the situation is less clear. Both types of parser generators exist (for example, yacc and bison are bottom-up, while ANTLR and JavaCC are top-down). Both have their advantages and disadvantages, and I don't think there's much cause to say that one approach is clearly better than the other.

In fact, I'd say it usually makes no sense to decide between top-down and bottom-up parsing as such. When hand-writing your parser, always go with the former. When using a parser generator, simply choose the tool whose features best fit your project, regardless of whether it generates bottom-up or top-down parsers.
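To make the "easier to write and maintain" point concrete, here is a minimal sketch of a hand-written top-down (recursive descent) parser for arithmetic expressions, written in TypeScript. The `Expr` type and the function names are made up for this illustration and are not Babylon's; the point is only that each grammar rule becomes one function, with the AST-building "semantic action" written inline.

```typescript
// Hypothetical AST shape for this sketch; these are not Babylon's node types.
type Expr =
  | { type: "Num"; value: number }
  | { type: "Binary"; op: string; left: Expr; right: Expr };

function parseExpression(src: string): Expr {
  // Very naive tokenizer: numbers, operators and parentheses.
  // Any other character is silently skipped, which a real lexer would not do.
  const tokens: string[] = src.match(/\d+|[-+*/()]/g) || [];
  let pos = 0;

  // expr := term (("+" | "-") term)*
  function expr(): Expr {
    let left = term();
    while (tokens[pos] === "+" || tokens[pos] === "-") {
      const op = tokens[pos++];
      left = { type: "Binary", op, left, right: term() };
    }
    return left;
  }

  // term := factor (("*" | "/") factor)*
  function term(): Expr {
    let left = factor();
    while (tokens[pos] === "*" || tokens[pos] === "/") {
      const op = tokens[pos++];
      left = { type: "Binary", op, left, right: factor() };
    }
    return left;
  }

  // factor := NUMBER | "(" expr ")"
  function factor(): Expr {
    const tok = tokens[pos++];
    if (tok === undefined) throw new SyntaxError("Unexpected end of input");
    if (tok === "(") {
      const inner = expr();
      if (tokens[pos++] !== ")") throw new SyntaxError("Expected ')'");
      return inner;
    }
    if (/^\d+$/.test(tok)) return { type: "Num", value: Number(tok) };
    throw new SyntaxError(`Unexpected token '${tok}'`);
  }

  const result = expr();
  if (pos < tokens.length) {
    throw new SyntaxError(`Unexpected token '${tokens[pos]}'`);
  }
  return result;
}
```

Calling `parseExpression("1 + 2 * 3")` yields a `Binary` node for `+` whose right child is the `Binary` node for `*`, so precedence falls out of which function calls which. Adding a new construct means adding or extending a function, which is a big part of why this style is popular for hand-written parsers.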

Hand-written parsers vs. parser generators

There are many reasons why one would hand-write a parser. These also depend on which parser generators are even available for the implementation language. One shortcoming that parser generators often suffer from is that they make it hard to produce good error messages for syntax errors.
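As a rough illustration of the error-message point: in a hand-written parser you know exactly which construct you are in when a token is missing, so you can say so. The sketch below is TypeScript, and `Token`, `locate` and `expectToken` are hypothetical names invented for this example, not any real parser's API.

```typescript
// Hypothetical token shape: the token text plus its offset in the source.
interface Token {
  value: string;
  offset: number;
}

// Convert a character offset into a 1-based line and column.
function locate(source: string, offset: number): { line: number; column: number } {
  const before = source.slice(0, offset).split("\n");
  return { line: before.length, column: before[before.length - 1].length + 1 };
}

// Return the token if it is the expected one, otherwise throw an error that
// names the position, what was expected, where in the grammar, and what was found.
function expectToken(
  source: string,
  tok: Token | undefined,
  expected: string,
  context: string
): Token {
  if (tok !== undefined && tok.value === expected) return tok;
  const offset = tok ? tok.offset : source.length;
  const { line, column } = locate(source, offset);
  const found = tok ? `'${tok.value}'` : "end of input";
  throw new SyntaxError(
    `${line}:${column}: expected '${expected}' ${context}, found ${found}`
  );
}
```

A call site such as `expectToken(src, tok, ")", "after if condition")` carries the context for free, because the parsing function that makes the call already knows it is in the middle of an `if` statement.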

Another possible problem is that a language that isn't context-free might need some dirty hacks to be implemented with a parser generator, or it might just not be possible at all.

How do these factors apply specifically to Babylon?

Hand-written parsers vs. parser generators

The JavaScript grammar is quite complicated with a lot of special cases to resolve ambiguities. It would probably require extensive hacks when using a parser generator and might not be possible at all with the parser generators available for JavaScript.
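One well-known example of such a special case (my choice of example, and certainly not the only one) is the `/` character, which starts either a division operator or a regular expression literal depending on what came before it. The TypeScript sketch below uses a deliberately crude heuristic based on the previous token; the `Tok` type and the helper names are hypothetical, and this is not how Babylon or acorn actually implement it.

```typescript
type Tok = { kind: "num" | "name" | "punct" | "regex"; text: string };

// Return the text matched at the start of s, or "" if there is no match.
function matchAt(re: RegExp, s: string): string {
  const m = re.exec(s);
  return m ? m[0] : "";
}

function tokenize(src: string): Tok[] {
  const out: Tok[] = [];
  let i = 0;
  while (i < src.length) {
    const rest = src.slice(i);
    const ws = matchAt(/^\s+/, rest);
    if (ws) {
      i += ws.length;
      continue;
    }

    // Crude assumption: if the previous token ended a value, "/" is division.
    // Real JavaScript needs more context, e.g. `return /re/` or `if (x) /re/.test(y)`.
    const prev = out[out.length - 1];
    const afterValue =
      prev !== undefined &&
      (prev.kind === "num" || prev.kind === "name" || prev.text === ")" || prev.text === "]");

    const regex =
      rest[0] === "/" && !afterValue
        ? matchAt(/^\/(?:\\.|[^\/\\\n])+\/[a-z]*/, rest)
        : "";
    const num = matchAt(/^\d+/, rest);
    const name = matchAt(/^[A-Za-z_$][\w$]*/, rest);

    let tok: Tok;
    if (regex) tok = { kind: "regex", text: regex };
    else if (num) tok = { kind: "num", text: num };
    else if (name) tok = { kind: "name", text: name };
    else tok = { kind: "punct", text: rest[0] };

    out.push(tok);
    i += tok.text.length;
  }
  return out;
}
```

With this sketch, `a / b / c` tokenizes as two divisions, while `x = /ab+c/g` produces a single regex token. The cases the heuristic gets wrong are exactly the ones where the tokenizer needs to know what the parser is currently expecting, which is awkward to express in a generator that assumes a fixed lexer feeding a separate grammar.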

I would also say that the parser generators available for JavaScript might not yet be production-ready and were even less so when the project was first created.

Top down vs. bottom up

As I said, I've never ever seen a hand-written bottom-up parser. So the decision to write a top-down parser is a no-brainer once you decide to go with a hand-written parser.

sepp2k
  • Thanks a lot for your reply. The section about hand-written parsers is probably exactly what I was looking for: the grammar is too complicated to fit well enough with parser generators. Thanks a lot! – franleplant Oct 18 '17 at 00:43