9

I have been asked to develop software that can create a flow chart (control flow) from input Java source code. I started researching it and arrived at the following solutions:

To create a flow chart or control flow, I have to recognize the control statements and function calls in the given source code. I see two ways of recognizing them:

  1. Parse the source code by writing my own grammar (a complex solution, I think). I am thinking of using ANTLR for this.
  2. Read the input source files as text and search for specific patterns (this may become inefficient).

Am I right here, or am I missing something fundamental and simple? Which approach would take less time and do the work efficiently? Any other suggestions are welcome too, especially efficient ones, because the input source code may span multiple files and can be fairly complex.

I am good at .NET languages, but this is my first big project in Java. I have basic knowledge of compiler design, so writing grammars should not be impossible for me.

Sorry if I am being unclear. Please ask for any clarifications.

Joachim Sauer
Sudh
  • Sounds like an interesting project. You might save yourself some work on the graphical end by using a framework like Eclipse GMP (http://www.eclipse.org/modeling/gmp/). If you're going for the text-based approach (which might be sufficient, depending on the input complexity), you could make use of Java annotations (http://download.oracle.com/javase/tutorial/java/javaOO/annotations.html). – Jules Mar 31 '11 at 08:53
  • You don't want to build grammars and try to construct this on your own. If you are good and have good tools, this will take you a year. There are many Java parsers, some of which have control flow analysis as an option; use one of those and get on with your life. – Ira Baxter Mar 31 '11 at 20:33
  • It would be nice to select an answer. – jmg Apr 07 '11 at 10:23
  • @jmg: Well, that is tough, because all the approaches are different but equally applicable. That makes the choice of answer a subjective one. – Sudh Apr 09 '11 at 19:42

9 Answers

7

I'd go with Antlr and use an existing Java grammar: https://github.com/antlr/grammars-v4

Peter Knego
  • Went through the link you provided and downloaded the .g file. Now do I need to open it using ANTLR? – Sudh Apr 02 '11 at 09:05
4

Tools that handle Java code usually decide first whether to process Java source or Java bytecode (class files). That is a strategic decision and depends on your use case; I could imagine either for flow chart generation. Once you have decided, there are already several frameworks and libraries that can help. For bytecode engineering there are ASM, Javassist, Soot, and BCEL (which seems to be dead). For parsing and analyzing Java source there are Polyglot, the Eclipse compiler, and javac. All of these include a complete compiler front end for Java and are open source.

I would try to avoid writing my own parser for Java. I did that once. Java has a rather complex grammar, although ready-made grammars can be found elsewhere. The real work begins with name and type resolution, and you would need both if you want to generate graphs that cover more than one method body.
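
As a rough illustration of reusing an existing front end instead of writing a parser, here is a minimal sketch that uses the Compiler Tree API shipped with the JDK (`com.sun.source`, which needs a JDK rather than a bare JRE) to list control statements and method calls in visit order. The class name `ControlStatementLister`, its `list` method, and the `StringSource` helper are made up for this example. It only parses; it does no name or type resolution.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.DoWhileLoopTree;
import com.sun.source.tree.EnhancedForLoopTree;
import com.sun.source.tree.ForLoopTree;
import com.sun.source.tree.IfTree;
import com.sun.source.tree.MethodInvocationTree;
import com.sun.source.tree.SwitchTree;
import com.sun.source.tree.WhileLoopTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not part of any library.
class ControlStatementLister {

    // Wraps a source string so javac can read it without a file on disk.
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    // Parses one source string and records control statements and calls in visit order.
    static List<String> list(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler; run on a JDK");
        }
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null,
                Collections.singletonList(new StringSource(className, code)));

        final List<String> found = new ArrayList<>();
        TreeScanner<Void, Void> scanner = new TreeScanner<Void, Void>() {
            @Override public Void visitIf(IfTree node, Void p) {
                found.add("if");
                return super.visitIf(node, p);
            }
            @Override public Void visitWhileLoop(WhileLoopTree node, Void p) {
                found.add("while");
                return super.visitWhileLoop(node, p);
            }
            @Override public Void visitDoWhileLoop(DoWhileLoopTree node, Void p) {
                found.add("do-while");
                return super.visitDoWhileLoop(node, p);
            }
            @Override public Void visitForLoop(ForLoopTree node, Void p) {
                found.add("for");
                return super.visitForLoop(node, p);
            }
            @Override public Void visitEnhancedForLoop(EnhancedForLoopTree node, Void p) {
                found.add("for-each");
                return super.visitEnhancedForLoop(node, p);
            }
            @Override public Void visitSwitch(SwitchTree node, Void p) {
                found.add("switch");
                return super.visitSwitch(node, p);
            }
            @Override public Void visitMethodInvocation(MethodInvocationTree node, Void p) {
                found.add("call " + node.getMethodSelect());
                return super.visitMethodInvocation(node, p);
            }
        };

        try {
            // parse() stops after syntax analysis; nothing is resolved or compiled.
            for (CompilationUnitTree unit : task.parse()) {
                scanner.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }
}
```

For a method containing an `if` that calls `g(x)`, followed by a `while`, `list` returns `if`, `call g`, `while`. Building an actual flow graph on top of this still requires connecting the statements with edges.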

jmg
  • Well, we need to parse the source code... Polyglot seems like a good idea, but won't it be a complex solution, given that our use case states the input source code will be error free? Another point is how do I include it in my application? Hack into its source code and see what's happening, or something else? – Sudh Apr 02 '11 at 08:47
  • @Sudh: It all comes down to the question: what do you need? You said you can assume correct input. Well, that's a good thing. But if you still need name and type resolution, e.g. to see which are the possible targets of a method call, then I'd go for a ready-made compiler front end, e.g. Polyglot, javac, or the Eclipse Java compiler. To the question of integration: is it possible to integrate your tool into the other framework? Could your tool be an Eclipse or IntelliJ plugin? Perhaps it's possible to configure an Eclipse instance without a GUI. Polyglot is designed to be extended. – jmg Apr 03 '11 at 10:51
2

Eclipse has a library (JDT) for parsing source code and creating an abstract syntax tree (AST) from it, which would let you extract what you want.

See here for a tutorial http://www.vogella.de/articles/EclipseJDT/article.html

See here for api http://help.eclipse.org/indigo/topic/org.eclipse.jdt.doc.isv/reference/api/org/eclipse/jdt/core/dom/package-summary.html#package_description

1

Now I have two ways of recognizing:

You have many more ways than that. JavaCC ships with a Java 1.5 grammar already built, and I'm sure other parser generators do too. There is no reason for you to have to either write your own grammar or construct your own parser.

And specifically 'read[ing] input source code files as text and search for the specific patterns' isn't a viable choice at all, as it isn't parsing, and therefore cannot possibly recognize Java programs correctly.
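
To make the point concrete, here is a small sketch of the text-search approach going wrong. The class `NaiveKeywordScan`, the regular expression, and the sample source are invented for this illustration; the pattern is just one plausible guess at what "searching for specific patterns" could look like.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of pattern matching on raw source text.
class NaiveKeywordScan {

    // Matches the word "if" followed by an opening parenthesis.
    private static final Pattern IF_KEYWORD = Pattern.compile("\\bif\\s*\\(");

    // Sample input: exactly one real if statement, plus a comment and a string literal.
    static final String SAMPLE =
            "class Demo {\n"
          + "  // if (legacy) was removed\n"
          + "  String help = \"use if (x) to branch\";\n"
          + "  void f(int x) { if (x > 0) { } }\n"
          + "}\n";

    // Counts every textual match, with no idea whether it is code, comment, or string.
    static int countIfMatches(String source) {
        Matcher matcher = IF_KEYWORD.matcher(source);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }
}
```

The sample contains one real `if` statement, but `countIfMatches(SAMPLE)` reports three, because the comment and the string literal also match. A parser tells those apart; a text search has to be taught each case by hand.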

user207421
  • Agree with you. Can I get more info about this Java 1.5 grammar? I mean how to use it and all. – Sudh Apr 02 '11 at 08:35
  • @Sudh: Sorry to say this but if you aren't familiar with the concept of grammars and compiler generators you aren't equipped to do this project. This is not something you are going to learn in a forum. – user207421 Apr 02 '11 at 11:53
  • I don't know where you got that idea, but I am familiar with the concept of grammars; I have worked with basic flex/bison for some time. That was basic, and I understand this is advanced stuff, but there is no fun without a challenge, is there? – Sudh Apr 03 '11 at 09:11
  • @Sudh: I got the idea about your unfamiliarity with grammars from you. You can find JavaCC at http://javacc.java.net/. I think you could have found that for yourself frankly. – user207421 Apr 05 '11 at 04:22
0

Your input files are written in Java, and the software should be written in Java, but this is your first project in Java? First of all, I'd suggest learning the language with smaller projects. Also you need to learn how to use graphics in Java (there are various libraries). Then, you should focus on what you want to show on your graphs. Or is text sufficient?

michelemarcon
  • Well, I would say I am familiar with the language. I have already done some smaller projects in it (DB access and so on) and am a bit familiar with Swing. I want to show the control flow in my graphs, and also the class structure of the source code. – Sudh Apr 02 '11 at 08:34
0

The way I would do it is to analyse compiled code. This would allow you to read JARs without source and avoid parsing the code yourself. I would use ObjectWeb's ASM to read the class files.

Peter Lawrey
  • Well, the use case says that it should be able to do it from source code files, because they have future plans to extend it to other languages too. So... – Sudh Apr 02 '11 at 08:49
  • Can these languages be compiled to .class files as well, e.g. Groovy, Scala, JRuby, Jython etc.? A large percentage of Java code in applications is already compiled. You will need some way to analyse it. – Peter Lawrey Apr 02 '11 at 08:53
  • By that I meant they have future plans to extend this application to analyse source code written in other languages like C/C++, JavaScript etc., so analyzing compiled code will not be an efficient solution. – Sudh Apr 02 '11 at 09:01
0

Or even easier: use reflection. You should be able to compile the sources, load the classes with a Java class loader, and analyse them from there. I think this is far easier than any parsing.
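
For a sense of what reflection exposes, here is a minimal sketch (the `ReflectionView` and `Sample` classes are made up for illustration). `Class.getDeclaredMethods()` gives the method names, return types, and parameters of a loaded class, but `java.lang.reflect` has no view of the statements inside a method body.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of what reflection reports about a class.
class ReflectionView {

    // A class whose only method contains an if and a while.
    static class Sample {
        int clamp(int x) {
            if (x < 0) {
                return 0;
            }
            while (x > 10) {
                x--;
            }
            return x;
        }
    }

    // Lists each declared method as "returnType name/parameterCount", sorted.
    static List<String> describe(Class<?> type) {
        List<String> lines = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            if (method.isSynthetic()) {
                continue; // skip compiler-generated methods
            }
            lines.add(method.getReturnType().getSimpleName() + " "
                    + method.getName() + "/" + method.getParameterCount());
        }
        Collections.sort(lines);
        return lines;
    }
}
```

`describe(ReflectionView.Sample.class)` yields the single entry `int clamp/1`; the `if` and `while` inside `clamp` do not show up anywhere.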

Daniel Bişar
  • @sudh: After rereading the question I understand that it should analyse not the classes but the individual control structures. I guess this is not possible with reflection... Maybe this is the reason for the downvote... – Daniel Bişar Apr 02 '11 at 11:53
0

A smarter solution is to use Eclipse's Java parser. Read more here: http://www.ibm.com/developerworks/opensource/library/os-ast/

nanda
  • I am not sure whether I understood the article properly. In it they were creating a class from the compilation unit, but don't I need to do the reverse? I mean, I will have the source code of a Java program, and what I need to do is create its class structure and control flow. I may be wrong here. – Sudh Apr 02 '11 at 08:42
  • The link is presently dead – Michael Fayad Feb 15 '16 at 15:34
  • This link is dead, please remove the post or fix the link! – Exploring Jun 02 '21 at 04:31
0

Our DMS Software Reengineering Toolkit is general purpose program analysis and transformation machinery, with built in capability for parsing, building ASTs, constructing symbol tables, extracting control and data flow, transforming the ASTs, prettyprinting ASTs back to text, etc.

DMS is parameterized by an explicit language definition, and has a large set of preexisting definitions.

DMS's Java Front End already computes control and data flow graphs, so your problem would be reduced to exporting them.

EDIT 7/19/2014: Now handles Java 8.
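
On the exporting step: one common way to turn a control flow graph into a picture is to write it out in Graphviz DOT and let `dot` do the layout. The sketch below is not DMS's API or output format; `FlowGraphDot` and its methods are hypothetical names, and the graph is built by hand just to show the shape of the output.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical minimal graph model that serializes itself to Graphviz DOT text.
class FlowGraphDot {
    private final Map<String, String> nodes = new LinkedHashMap<>(); // node id -> label
    private final List<String[]> edges = new ArrayList<>();          // {from, to, label}

    FlowGraphDot node(String id, String label) {
        nodes.put(id, label);
        return this;
    }

    // An empty label produces an unlabeled edge.
    FlowGraphDot edge(String from, String to, String label) {
        edges.add(new String[] {from, to, label});
        return this;
    }

    String toDot() {
        StringBuilder sb = new StringBuilder("digraph cfg {\n");
        for (Map.Entry<String, String> n : nodes.entrySet()) {
            sb.append("  ").append(n.getKey())
              .append(" [label=\"").append(escape(n.getValue())).append("\"];\n");
        }
        for (String[] e : edges) {
            sb.append("  ").append(e[0]).append(" -> ").append(e[1]);
            if (!e[2].isEmpty()) {
                sb.append(" [label=\"").append(escape(e[2])).append("\"]");
            }
            sb.append(";\n");
        }
        return sb.append("}\n").toString();
    }

    // Escapes backslashes and double quotes so labels stay valid DOT strings.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

For example, `new FlowGraphDot().node("n1", "if (x > 0)").node("n2", "g(x)").edge("n1", "n2", "true").toDot()` produces a `digraph` with two labeled nodes and one edge labeled `true`, which Graphviz can render as a flow chart.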

Ira Baxter
  • Hmm, will give it a try. But how do I include it in (possibly merge it with) my application? – Sudh Apr 02 '11 at 08:59
  • You haven't been clear what you wanted to do with the flowgraph if you had it. In the absence of any specific requirements, the simplest scheme is to configure DMS to extract that information and simply launch it as a subprocess. – Ira Baxter Apr 02 '11 at 09:29
  • It looks like an old, unmaintained library though. – Exploring Jun 02 '21 at 04:28
  • How on earth did you draw that conclusion? DMS has been under active development and enhancement for 25+ years. We build all kinds of tools with it as well as carrying active migrations of million line systems in COBOL to Java. – Ira Baxter Jun 02 '21 at 04:47