I have downloaded ANTLR 3.3 and ANTLRWorks, along with Java.g from the ANTLR site. Using ANTLRWorks, I was able to generate JavaParser.java, JavaLexer.java and the tokens file from Java.g. I then added the ANTLR jar to my IDE and followed these instructions to use it in my code: http://www.antlr.org/wiki/pages/viewpage.action?pageId=789

The first problem arose when the documentation above says to code the following line:

RuleReturnScope result = parser.compilationUnit();

The problem is that parser.compilationUnit() does not return a result.

Then I tried following the example further down under "Parsing a tree", but that example is incomplete.

I can't find any good documentation on how to use this library.

Here is what I want to do: from Java, pass a file name, a File object, or the file contents as a String to ANTLR and get back some sort of object that I can navigate in code and that gives me things like the imports, methods, class variables, expressions, etc.

Basically, this is what the NetBeans IDE 6.9.1 does for refactoring: fix imports, go to source, rename variables, etc. All I need is the metadata about the class, and I am not sure how to obtain it.

Thanks.

Coder
  • There are various `Java.g` grammars lying around: can you post your (entire!) modified `.g` grammar file? And what exactly are you trying to extract from `.java` source files? – Bart Kiers Feb 09 '11 at 11:13
  • My grammar is the one from ANTLR and can be found here: http://openjdk.java.net/projects/compiler-grammar/antlrworks/Java.g I am trying to extract what I described in my message above (imports, method names, variable names, and method content, itself parsed as expressions with variables, etc.). The most important for now is imports, but I would like the rest too. – Coder Feb 09 '11 at 11:19
  • Do you really need to write this from the ground up? Is this an (academic) exercise, or do you need to do this for your work? If it's the latter, I'd strongly suggest using 3rd party tools or libraries that already do this. For example, it is quite tricky to rename variables in a source file: you will need to keep track of all the scopes in the source file, and that's not even half of it when there are public/protected variables that are used by other source files. But if it's an exercise, I can imagine you doing it the "hard" way, but then at least start with Parr's book about ANTLR. – Bart Kiers Feb 10 '11 at 11:10

1 Answer

The reason compilationUnit() does not return a result is that, if you don't specify a return type on a rule yourself, the parser rule is simply translated into:

public void nameOfTheParserRule() { 
    ... 
} 

If you set output=AST in the options { ... } section of your grammar, parser rules return an object of type RuleReturnScope by default. I suspect this is what went wrong in your case, but I cannot be certain, since you did not post your grammar and the Java.g grammar you linked to does not generate anything: one needs to do certain things with it before being able to generate a parser and lexer from it. You most likely did the wrong things :).

Best of luck!

Bart Kiers
  • I'm not sure what you mean by "does not generate anything". I am able to generate a parser, lexer and token file from it using ANTLRWorks. As for modifying it, I have no clue what to do. Since I need metadata about imports, variables, and methods, can you post in more detail what I need to modify in the grammar I linked to in order to make this happen? Is this a difficult process? Also, once I make the modifications, could you post some Java code that parses a Java file and extracts, e.g., the imports and method names? – Coder Feb 10 '11 at 10:23
  • I was also trying to find the grammar that the NetBeans IDE uses, as they seem to use ANTLR, and the IDE can identify imports, methods, and variables, because I can refactor methods, variables, etc. in the IDE. I need the same level of granularity in my application. I'm very new to ANTLR and I'm not sure how hard it is to get what I want from it. So the more examples you supply, the more useful it will be for me. Thanks! – Coder Feb 10 '11 at 10:24
  • The code I used in Java came directly from the ANTLR examples (the link in my post) and looks like this: `JavaLexer lexer = new JavaLexer(cs); CommonTokenStream tokens = new CommonTokenStream(); tokens.setTokenSource(lexer); grammars.output.JavaParser parser = new grammars.output.JavaParser(tokens); parser.typeArgument(); // parser.compilationUnit();` And yes, compilationUnit does not return a value, as you said. – Coder Feb 10 '11 at 10:26
  • OK, I tried setting the options to include "output=AST;" and now I get a return value, but its tree variable is null, and so are the start and stop text. Can you provide some more assistance in obtaining the metadata I need? Thanks. – Coder Feb 10 '11 at 10:37
  • @Coder: you're trying to run before being able to walk. I really don't see how I can explain all this to you since all the terminology I'd use will be foreign to you (or a big part of it). Your first big problem is that you want an AST from the source, which I explained in detail here: http://stackoverflow.com/questions/4931346/how-to-output-the-ast-built-using-antlr If you have that working, look into the ANTLR API docs to see how to iterate the AST (hint: look at the docs of CommonTree for that). Good luck! – Bart Kiers Feb 10 '11 at 10:50
  • OK, thanks. I'm reading the link you sent right now. Once I get the AST from the source and walk the tree using CommonTree, will I have everything I need, or are there more steps? Also, is there any other documentation you are aware of that explains the steps from start to finish, including some of the terminology? I took this at university a few years ago, so I know some of the terminology and how grammars work, but I'm not sure how ANTLR implemented things, and I have forgotten some of the terminology. Thanks again for pointing me in the right direction. – Coder Feb 10 '11 at 10:58
  • @Coder: ah yeah, there *is* source being generated, sorry about that. But there are also *many* errors reported that need resolving! I wouldn't use these .java files before being able to "cleanly" generate a parser and lexer from the grammar. – Bart Kiers Feb 10 '11 at 10:58
  • @Coder, also note the instruction *"This is a merged file, containing two versions of the Java.g grammar. To extract a version from the file, run the ver.jar with the command provided below. ..."* in the comments of that grammar file. So it seems the grammar is not okay "as is". – Bart Kiers Feb 10 '11 at 11:02
  • @Coder, no, I meant that specific ANTLR terminology would be foreign to you. – Bart Kiers Feb 10 '11 at 11:05
  • Yeah, I just realized that. There is a problem in the importDeclaration section: it can't resolve "IDENTIFIimportDeclarationER". Do you know where I could get a good version of the grammar? Perhaps borrow the one from the NetBeans IDE (although I looked through the source and could not find the grammar)? BTW, thanks again for all the help. – Coder Feb 10 '11 at 11:06
  • Hmm, I'm trying to run the command you found in the grammar, but I don't know what version.jar they are talking about. I got the grammar from here: http://www.antlr.org/grammar/list It was the first one in the list. – Coder Feb 10 '11 at 11:13
  • @Coder, you're welcome. This grammar (of Java 1.5) does generate a parser and lexer without warnings: http://www.antlr.org/grammar/1207932239307/Java1_5Grammars/Java.g – Bart Kiers Feb 10 '11 at 11:17
  • OK, I had some success with this one. Thanks for finding it. I was able to look at the children of the RuleReturnScope from the line `RuleReturnScope result = parser.compilationUnit();` and it seemed to parse the imports. For the import line "import grammars.output.JavaLexer;" it yielded a child "import" with a child ".", which in turn has two children: "." and "JavaLexer". I guess I have to tweak the grammar using "^" and "!" to return "grammars" and "output" or, even better, the whole import "grammars.output.JavaLexer". – Coder Feb 10 '11 at 11:44
  • @Coder, good to hear that. I recommend playing around with it a bit, and if you get stuck, ask a specific question here, or on ANTLR's mailing list. Getting hold of [the Definitive ANTLR reference](http://www.amazon.com/dp/0978739256/?tag=stackoverfl08-20) doesn't hurt either, of course. Good luck! – Bart Kiers Feb 10 '11 at 12:02
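
For the nested "." tree described in the comments above, the full import name can probably be recovered in plain Java without touching the grammar. The following is only a sketch under assumptions: `Node` is a hypothetical stand-in for an ANTLR tree node (used here so the example runs without the ANTLR jar), the root label "unit" is made up, and it is assumed that each "import" node keeps the dotted name as its first child and that the inner "." node holds "grammars" and "output". With the real runtime, the same walk would go through the tree's `getText()`, `getChildCount()` and `getChild(int)` methods instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ImportNames {

    // Hypothetical stand-in for an ANTLR tree node: a label plus ordered children.
    static final class Node {
        final String text;
        final List<Node> children;

        Node(String text, Node... children) {
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    // Rebuilds a dotted name from nested "." nodes: the children are visited
    // left to right and joined with the text of the operator node itself.
    static String qualifiedName(Node node) {
        if (node.children.isEmpty()) {
            return node.text; // leaf: a plain identifier
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < node.children.size(); i++) {
            if (i > 0) {
                sb.append(node.text);
            }
            sb.append(qualifiedName(node.children.get(i)));
        }
        return sb.toString();
    }

    // Collects the qualified name under every "import" child of the root.
    // Assumes the first child of an "import" node is the (possibly nested) name.
    static List<String> imports(Node root) {
        List<String> result = new ArrayList<>();
        for (Node child : root.children) {
            if ("import".equals(child.text) && !child.children.isEmpty()) {
                result.add(qualifiedName(child.children.get(0)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Tree shape reported for "import grammars.output.JavaLexer;"
        Node name = new Node(".",
                new Node(".", new Node("grammars"), new Node("output")),
                new Node("JavaLexer"));
        Node root = new Node("unit", new Node("import", name));
        System.out.println(imports(root)); // prints [grammars.output.JavaLexer]
    }
}
```

Flattening in code leaves the generated parser untouched. Rewriting the grammar with "^" and "!" is the alternative if the tree itself should carry a single node per import.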