0

I want to write a parser for JavaScript.

What I have figured out so far is that I need to scan the source character by character, and as soon as I encounter a {, I must keep track of it until I reach the matching } (closing brace). A stack should make this efficient. Can anyone suggest a better idea or approach for building a JavaScript parser in Java?
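The stack idea described above can be sketched roughly as follows. This is only a minimal illustration: the class name `BracketChecker` and the method `isBalanced` are made up for this example, and the check deliberately ignores string literals, comments, and regex literals, so it is a bracket matcher rather than a JavaScript parser.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper (not from any library): checks that (), [] and {} nest correctly.
final class BracketChecker {

    static boolean isBalanced(String source) {
        Deque<Character> open = new ArrayDeque<>(); // stack of unmatched openers
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            switch (c) {
                case '(': case '[': case '{':
                    open.push(c);
                    break;
                case ')': case ']': case '}':
                    // A closer must match the most recently opened bracket.
                    if (open.isEmpty() || !matches(open.pop(), c)) {
                        return false;
                    }
                    break;
                default:
                    break; // every other character is ignored
            }
        }
        return open.isEmpty(); // leftover openers mean an unclosed bracket
    }

    private static boolean matches(char opener, char closer) {
        return (opener == '(' && closer == ')')
            || (opener == '[' && closer == ']')
            || (opener == '{' && closer == '}');
    }
}
```

Because the sketch knows nothing about tokens, valid code such as `var s = "}";` is reported as unbalanced. That gap is one reason a real parser tokenizes the input first instead of looking at raw characters.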

Mat
Abhishek Choudhary
  • This would help for a simple parser, but if you want it to be effective and fast it should use much more intelligent ways. In college I remember a course called "Compilers & Assemblers" all talking about such stuff. – mohdajami Sep 06 '11 at 08:25
  • I'd rather not get into those books for now. Which algorithm could I follow instead? – Abhishek Choudhary Sep 06 '11 at 08:30
  • Then this simple way should do for now, and also consider Regex and Pattern matching. – mohdajami Sep 06 '11 at 08:31
  • 4
    No, *don't* consider regexps. That way lies madness. JavaScript is just as **impossible** to parse with regexps as [HTML](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454). Regular expressions can validate regular languages, but JS is context free and ***can not be parsed with regexps***. – gustafc Sep 06 '11 at 08:38
  • 1
    And before anyone brings up using those regex extensions that allow recognizing more than regular languages: No, those *may* work in theory but are just as much madness in practice. Just write a goddamn parser (the parser can be a few hundred lines of Python!). –  Sep 06 '11 at 10:26
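To give a feel for what "just write a parser" can look like, here is a small recursive-descent sketch in Java. It handles only arithmetic with `+ - * /` and parentheses, not JavaScript; the class name `TinyExprParser` and its grammar are assumptions made up for this illustration. The point is that each grammar rule becomes one method, and nesting falls out of the recursion, which is the part plain regular expressions cannot express.

```java
// Hypothetical sketch: recursive-descent evaluator for + - * / and parentheses.
final class TinyExprParser {
    private final String src;
    private int pos = 0;

    private TinyExprParser(String src) {
        this.src = src;
    }

    static double evaluate(String text) {
        TinyExprParser p = new TinyExprParser(text);
        double value = p.expression();
        p.skipSpaces();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("Unexpected input at " + p.pos);
        }
        return value;
    }

    // expression := term (('+' | '-') term)*
    private double expression() {
        double value = term();
        while (true) {
            if (accept('+')) value += term();
            else if (accept('-')) value -= term();
            else return value;
        }
    }

    // term := factor (('*' | '/') factor)*
    private double term() {
        double value = factor();
        while (true) {
            if (accept('*')) value *= factor();
            else if (accept('/')) value /= factor();
            else return value;
        }
    }

    // factor := NUMBER | '(' expression ')'
    private double factor() {
        if (accept('(')) {
            double value = expression(); // recursion handles arbitrary nesting
            if (!accept(')')) {
                throw new IllegalArgumentException("Expected ')' at " + pos);
            }
            return value;
        }
        skipSpaces();
        int start = pos;
        while (pos < src.length()
                && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected number at " + pos);
        }
        return Double.parseDouble(src.substring(start, pos));
    }

    // Consumes the expected character (after optional whitespace) if it is next.
    private boolean accept(char expected) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == expected) {
            pos++;
            return true;
        }
        return false;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }
}
```

A call such as `TinyExprParser.evaluate("(1 + 2) * 3")` walks the three rules in order of precedence. A parser for real JavaScript would need a separate tokenizer and many more rules, but it can follow the same one-method-per-rule shape.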

4 Answers

2

You may want to investigate ANTLR. It is a tool that generates parser classes in Java or other languages from a grammar file that you write. You will likely be able to find a grammar (or at least a partial grammar) for JavaScript online.

ANTLR home page with tutorials: http://www.antlr.org/

If you're not familiar with the concept of grammars, you may need to read up on them and on compiling; a good starting point is probably Wikipedia: http://en.wikipedia.org/wiki/Formal_grammar

mcfinnigan
1

ANTLR is the de facto standard for building parsers (not only in Java), and it is also very easy to use (there is even an Eclipse plugin). There seem to be some readily available grammars for JavaScript.

Tomasz Nurkiewicz
1

There is already a complete JavaScript engine written in Java, named Rhino. Obviously it has to include a parser, and it's open source, so you could have a look at how it's done there.

I suspect that you'll find that parsing a language such as JavaScript is much more complex than you expect.

Michael Borgwardt
0

There's also JavaCC, another parser generator for Java.

Andrew Fielden