
I'm trying to implement a JavaScript parser and I'm stuck on a problem. JavaScript has a feature called "automatic semicolon insertion". What is a common way to deal with automatic semicolon insertion in JS parsers?

Should I rewrite the original grammar so that it can handle automatically inserted semicolons? Is that even possible?

Or should I implement a parser for the original grammar and then handle the semicolons with some separate technique?

All suggestions are welcome.

sergeyz
  • I think you just have to make your parser [follow the process described in the spec](http://www.ecma-international.org/ecma-262/5.1/#sec-7.9). – Pointy Jan 12 '13 at 17:27
  • @Pointy, but implementing the rules "as is" will hurt performance: I have to keep some parser state (like "the last token read was a line terminator" or "the last token read was a right curly bracket") and, in the worst case, parse a statement twice (first try to parse the tokens "as is", then try again with automatically inserted semicolons). – sergeyz Jan 12 '13 at 18:13
  • Yes, I agree; JavaScript has a nasty grammar to parse. The regex issue is similarly evil. The only ideas I've ever had involve some creative tweaks to the lexer, so that these weird cases are actually recognized as special tokens. I don't know whether that could really work, however. – Pointy Jan 12 '13 at 19:10
  • How does it slow down performance? Aside from the small set of "restricted productions", you only need to insert `;` if the current token cannot be accepted, so you can just add the semicolon instead of flagging an error (and then flag an error if the semicolon can't be accepted either). – rici Jan 12 '13 at 19:12
  • There is a related question here http://stackoverflow.com/questions/27069755/how-to-implement-javascript-automatic-semicolon-insertion-in-javacc – Theodore Norvell Oct 23 '15 at 21:00
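
The approach rici describes in the comments (insert a semicolon only when the current token cannot be accepted, plus a special case for restricted productions) can be sketched roughly as below. This is a minimal illustration in Python for a toy grammar (identifiers, numbers, `+`, `return`, and `{ }` blocks), not a real JavaScript parser. All names (`Token`, `tokenize`, `Parser`, `consume_semicolon`) are hypothetical. It assumes the lexer records on each token whether a line terminator preceded it, so no statement is ever parsed twice. It ignores the spec's exceptions for empty statements and `for` headers.

```python
import re
from dataclasses import dataclass

@dataclass
class Token:
    kind: str             # "name", "num", "punct", or "eof"
    value: str
    newline_before: bool  # True if a line terminator precedes this token

TOKEN_RE = re.compile(
    r"(?P<nl>\n)|(?P<ws>[ \t\r]+)|(?P<num>\d+)"
    r"|(?P<name>[A-Za-z_$][\w$]*)|(?P<punct>[;{}+=])"
)

def tokenize(src):
    tokens, pos, saw_newline = [], 0, False
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {src[pos]!r}")
        pos = m.end()
        if m.lastgroup == "nl":
            saw_newline = True          # remember it for the next real token
        elif m.lastgroup != "ws":
            tokens.append(Token(m.lastgroup, m.group(), saw_newline))
            saw_newline = False
    tokens.append(Token("eof", "", saw_newline))
    return tokens

class Parser:
    def __init__(self, src):
        self.tokens = tokenize(src)
        self.i = 0

    def peek(self):
        return self.tokens[self.i]

    def advance(self):
        tok = self.tokens[self.i]
        self.i += 1
        return tok

    def consume_semicolon(self):
        tok = self.peek()
        if tok.value == ";":
            self.advance()              # explicit semicolon
        elif tok.kind == "eof" or tok.value == "}" or tok.newline_before:
            pass                        # "virtual" semicolon: consume nothing
        else:
            raise SyntaxError(f"unexpected token {tok.value!r}")

    def parse_program(self):
        body = []
        while self.peek().kind != "eof":
            body.append(self.parse_statement())
        return body

    def parse_statement(self):
        tok = self.peek()
        if tok.value == "{":
            self.advance()
            body = []
            while self.peek().value != "}":
                body.append(self.parse_statement())
            self.advance()
            return ("block", body)
        if tok.kind == "name" and tok.value == "return":
            self.advance()
            nxt = self.peek()
            # Restricted production: a line terminator after `return` ends it.
            if nxt.newline_before or nxt.kind == "eof" or nxt.value in (";", "}"):
                arg = None
            else:
                arg = self.parse_expression()
            self.consume_semicolon()
            return ("return", arg)
        expr = self.parse_expression()
        self.consume_semicolon()
        return ("expr", expr)

    def parse_expression(self):
        left = self.parse_operand()
        while self.peek().value == "+":   # a newline before `+` does not end the expression
            self.advance()
            left = ("+", left, self.parse_operand())
        return left

    def parse_operand(self):
        tok = self.advance()
        if tok.kind not in ("name", "num"):
            raise SyntaxError(f"unexpected token {tok.value!r}")
        return tok.value
```

With this sketch, `Parser("a + b\nc").parse_program()` yields two expression statements, `Parser("a\n+ b").parse_program()` yields a single `a + b` statement because `+` is still acceptable after the newline, and `Parser("a b").parse_program()` raises `SyntaxError` because nothing justifies a semicolon between the two tokens. The only extra state is one boolean per token.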
