1

I have a question about my program. It should be able to create a UML diagram from Java code, but I don't know how to design the method that reads (loads) Java keywords, objects, etc. Note: I cannot use automated tools to create the UML diagram; this is for my thesis.

My idea was to create an enum of the Java keywords that appear in a UML diagram and to check all of the loaded code against this enum. But there are several problems I am not able to solve, especially whitespace. The next problem is the following:

For example, I have this code:

[space][space]public[space]class[space][space][space]SomeClass[space]{
[empty line]
private int something;
public BufferedReader br;
private ArrayList<File> al;
}

Comments for this code:

  • [space] represents an ordinary space character in the code.
  • I have highlighted the spaces on purpose, because the code is still syntactically correct with them.
  • I don't know how to recognize all Java classes such as BufferedReader, ArrayList, etc., or other objects.

Thank you for any reply. I suppose there is a better way to solve this problem.

Jørgen R
avalagne

4 Answers

1

You should think about using a library that parses Java code for you. I can't name one offhand; try searching for one.

Here is the approach I would follow. First read this article about the Eclipse Abstract Syntax Tree. Using the Eclipse AST would imply designing your tool as an Eclipse plug-in. If you don't want to do that, the article will still give you some hints on how to parse a source tree.

Kai
  • Thank you for the reply. I use the NetBeans IDE, and I'm not sure whether switching to Eclipse now would be a good idea. Unfortunately, the input can only be pure Java source files, not compiled code. Reading up to each whitespace character might be a solution, but I do not know how I would find out the names of attributes, classes, etc. – avalagne Apr 19 '12 at 14:39
  • @avalagne: Netbeans provides a [comparable API](http://wiki.netbeans.org/Java_DevelopersGuide). This solution doesn't need compiled code and would give you names of attributes, classes, etc. Have you read the article I mentioned? – Kai Apr 19 '12 at 15:11
  • Yes, I read it. If I understand, it's based on some sort of AST, which analyzes Java code. Thank you again. I'll try to figure out how it works in Netbeans TreeMaker. – avalagne Apr 19 '12 at 16:30
1

If you can use compiled code, that would be nice. In Java you can load a class without initialising it and inspect its structure with java.lang.reflect. Of course, some details will be missing; parameter names, for instance.

For more details there are alternative class parsing libraries like ASM.
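To make the reflection route concrete, here is a minimal sketch, assuming the classes to be drawn are already compiled and on the classpath. The names `ReflectionSketch`, `describe`, `visibility` and the nested `Sample` class are made up for illustration; only the `java.lang.reflect` and `Class.forName` calls are real JDK API.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class ReflectionSketch {

    // Hypothetical sample class, modelled on the code in the question.
    static class Sample {
        private int something;
        public java.io.BufferedReader br;
    }

    /** Returns one UML-style line per declared field and method of the named class. */
    static List<String> describe(String className) throws ClassNotFoundException {
        // initialize = false: load the class without running its static initializers
        Class<?> type = Class.forName(className, false, ReflectionSketch.class.getClassLoader());
        List<String> lines = new ArrayList<>();
        lines.add("class " + type.getSimpleName());
        for (Field f : type.getDeclaredFields()) {
            lines.add(visibility(f.getModifiers()) + " " + f.getName()
                    + " : " + f.getType().getSimpleName());
        }
        for (Method m : type.getDeclaredMethods()) {
            lines.add(visibility(m.getModifiers()) + " " + m.getName()
                    + "() : " + m.getReturnType().getSimpleName());
        }
        return lines;
    }

    /** Maps Java modifiers to the usual UML visibility markers. */
    static String visibility(int modifiers) {
        if (Modifier.isPublic(modifiers)) return "+";
        if (Modifier.isPrivate(modifiers)) return "-";
        if (Modifier.isProtected(modifiers)) return "#";
        return "~"; // package-private
    }
}
```

Calling `ReflectionSketch.describe(ReflectionSketch.Sample.class.getName())` should yield lines such as `- something : int`. Generic type arguments (the `File` in `ArrayList<File>`) are not shown by `getType()`; `Field.getGenericType()` would be the place to look for them.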

Joop Eggen
1

Adding to the two previous answers and your comments, particularly re. 'pure java code'.

Assuming I understand your question correctly, the first thing you have to do is transform the Java source code (i.e. text files) into some data structures. From there you can generate UML diagrams from those data structures.

If so, that's a pretty common pattern. There are generally two approaches for converting text to data structures:

  • parsing the text (as suggested by @user714965)
  • using reflection (as suggested by @Joop Eggen)

Hand-writing a parser is not a trivial affair. Your comment about creating an enum class etc. suggests that's what you're thinking of. However, a hand-coded parser is the recommended solution in only a very few cases. There's a whole body of theory and practice dedicated to parsing algorithms and techniques. I'm really not sure you want to get into that for your project.
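To give a feel for what the lexical part involves, and for why the extra spaces and empty lines in your example stop mattering once the text is split into tokens first, here is a minimal scanner sketch. It is deliberately incomplete: it ignores comments, string literals and multi-character operators, and the names `ScannerSketch` and `tokenize` are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

class ScannerSketch {

    /** Splits Java-like source text into word, number and single-character symbol tokens. */
    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // spaces, tabs and empty lines only separate tokens
            } else if (Character.isJavaIdentifierStart(c)) {
                int start = i; // keyword or identifier: read to the end of the word
                while (i < source.length() && Character.isJavaIdentifierPart(source.charAt(i))) i++;
                tokens.add(source.substring(start, i));
            } else if (Character.isDigit(c)) {
                int start = i; // plain integer literal
                while (i < source.length() && Character.isDigit(source.charAt(i))) i++;
                tokens.add(source.substring(start, i));
            } else {
                tokens.add(String.valueOf(c)); // symbol such as { } ; < >
                i++;
            }
        }
        return tokens;
    }
}
```

For the input `"  public class   SomeClass {"` this produces the four tokens `public`, `class`, `SomeClass`, `{`, regardless of how many spaces sit between them. A parser would then work on that token list, deciding for each word whether it is a keyword, a type name or a member name from its position.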

Most people would use a parser generator (e.g. ANTLR) to generate the parser from a grammar definition. Given the popularity of Java, there's at least one existing Java grammar for ANTLR. I'm not quite sure what the 'pure Java code' constraint means. ANTLR generates a pure Java parser, so that would be OK. If you mean you need to write all the code from scratch, then using a parser generator would be out. But that seems a very strange constraint...

Anyway. Your other option is using reflection. That in effect lets the compiler and the JVM do the parsing for you (so the source has to be compiled first) and gives you API access to query and navigate the code itself. java.lang.reflect is also (obviously) pure Java, so your code calling it would be too.

The Eclipse/Netbeans API will provide you another possible route. In effect they are just another 'parser' that provides a set of data structures representing your java code.
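As an illustration of what such an API-provided syntax tree looks like, here is a sketch that uses neither the Eclipse nor the NetBeans API but the JDK's own Compiler Tree API (`com.sun.source`, available when running on a JDK rather than a plain JRE). It parses source text only, without compiling it. The names `TreeApiSketch` and `outline` are hypothetical, and Java 9 or later is assumed.

```java
import com.sun.source.tree.ClassTree;
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.VariableTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

class TreeApiSketch {

    /** Parses one source string (no compilation) and lists its classes and variables. */
    static List<String> outline(String fileName, String source) throws IOException {
        // Wrap the source text in an in-memory file object
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///" + fileName), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null on a plain JRE
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null, List.of(file));

        List<String> found = new ArrayList<>();
        TreeScanner<Void, Void> visitor = new TreeScanner<>() {
            @Override
            public Void visitClass(ClassTree node, Void p) {
                found.add("class " + node.getSimpleName());
                return super.visitClass(node, p); // keep descending into members
            }

            @Override
            public Void visitVariable(VariableTree node, Void p) {
                // Fields, parameters and locals all arrive here
                found.add(node.getModifiers().getFlags() + " " + node.getType() + " " + node.getName());
                return super.visitVariable(node, p);
            }
        };
        for (CompilationUnitTree unit : task.parse()) {
            unit.accept(visitor, null);
        }
        return found;
    }
}
```

Because only the parse step runs, unresolved types such as `BufferedReader` without an import do not cause a failure here; the tree simply records the type name as written.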

I'd strongly recommend one of those three approaches instead of writing your own parser by hand.

I'm not sure if that helps. Perhaps you could explain the 'pure java code' constraint a bit more.

sfinnie
  • I will try to describe my task better. The title of the topic is 'Analysis of source code in Java applications'. The work description is: the goal is lexical and syntactic analysis of Java. The data will be used to build a UML model of the analyzed project, stored in an XML file. The following steps are also stated: 1) selection of the host language of the analyzer, 2) build a scanner (lexical analyzer), 3) build a parser (syntax analyzer), 4) save the information obtained in the form of XML, 5) UML graphical representation of the analyzed project. – avalagne Apr 19 '12 at 17:02
  • This is the description of my work. To be honest, I just did not know how to do it simply, if that is possible. So I devised various solutions, such as reading up to whitespace characters, etc. My supervisor is unfortunately an older gentleman who just wants to sit in school, so I'll be glad for any idea that I could use. Of all the options you mention, I like parsing the text best, but I'll keep working on it. Thank you! – avalagne Apr 19 '12 at 17:02
  • hmm, ok. Well, if one of the goals is to learn how parsers work then fair enough. And if so then I'd recommend doing some reading on parser theory & practice. There are lots of good books available. But I would double-check with your supervisor that he is expecting you to write a parser from scratch. Using a parser-generator will probably teach you as much (maybe more) about the concepts of lexical & syntactic analysis as hand-writing. Good luck! – sfinnie Apr 19 '12 at 17:15
0

Simply choose a parser generator like JavaCC (your task 1). Use an existing Java grammar (or build your own). Generate the lexer (2) and write your own parser logic into the grammar (3). Save the tree from your parser, modifying it first if needed, to get an XML/XMI representation (4). For 5) you should really choose an existing tool, as writing your own could be a complete additional thesis...
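For the XML step (4), a minimal sketch using the DOM and transformer classes that ship with the JDK could look like the following. The element and attribute names (`class`, `attribute`, `visibility`, `type`, `name`) are made up for this example and are not XMI; likewise `XmlExportSketch` and `toXml` are hypothetical names.

```java
import java.io.StringWriter;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class XmlExportSketch {

    /** Serializes one class; each attribute is given as {visibility, type, name}. */
    static String toXml(String className, List<String[]> attributes) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element cls = doc.createElement("class");
        cls.setAttribute("name", className);
        doc.appendChild(cls);
        for (String[] a : attributes) {
            Element attr = doc.createElement("attribute");
            attr.setAttribute("visibility", a[0]);
            attr.setAttribute("type", a[1]); // special characters such as < are escaped on output
            attr.setAttribute("name", a[2]);
            cls.appendChild(attr);
        }
        // Write the DOM tree out as text
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        t.setOutputProperty(OutputKeys.INDENT, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

A call such as `XmlExportSketch.toXml("SomeClass", List.of(new String[] {"private", "int", "something"}))` returns a `class` element with one nested `attribute` element. In a real tool the list would be filled from the parser's tree rather than by hand.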

Christian
  • Thank you for the help. A rough outline like this is what I needed... Do you have any experience with a particular Java parser? – avalagne Apr 20 '12 at 09:51
  • JavaCC and ANTLR are both really good. Read this thread: http://stackoverflow.com/questions/382211/whats-better-antlr-or-javacc , or search for both names in your favorite search engine to find more. – Christian Apr 20 '12 at 15:24
  • Thank you, I read that thread. I am looking for a parser that I can learn as quickly as possible. From what I read, JavaCC is the easier one. – avalagne Apr 21 '12 at 08:07