10

Is there an open source Java API that can compare two abstract syntax trees (ASTs) of Java source code?

I would like to see the differences between the two syntax trees, similar to how it is done in diff tools.

Jon Egeland

3 Answers

10

Yes, there are free implementations that output tree diffs:

GumTree (fast, multi-language, integrates with git): https://github.com/GumTreeDiff/gumtree

ChangeDistiller (quite mature, built as a self-contained library): https://bitbucket.org/sealuzh/tools-changedistiller/wiki/Home

CodingSpectator (the AST diffing is hard-wired into the rest of the code, so it is not easy to reuse on its own): https://github.com/vazexqi/CodingSpectator/tree/codingtracker-ast-inference

cdmihai
8

Most diff tools compare lines, not syntax trees (see Wikipedia article for discussion).

There are some technical papers that discuss how to compare syntax trees, e.g., Diff/TS: A Tool for Fine-Grained Structural Change Analysis.

As far as I know, there are no APIs for computing tree differences available anywhere. The problem is more complex than it first sounds if you want a minimal diff, but the basic technique is to use some variation of the Levenshtein distance metric.
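To make the Levenshtein idea concrete, here is a minimal sketch of one such variation: a top-down tree edit distance that runs the usual Levenshtein dynamic program over each node's child list, where "substituting" one child for another costs the recursive distance between the two subtrees. The `Node` class, the label strings, and the unit costs are all assumptions made for illustration; this is not how any of the tools mentioned in this thread work, and because it only inserts or deletes whole subtrees it does not produce a minimal diff in general.

```java
import java.util.Arrays;
import java.util.List;

final class TreeDistanceSketch {

    /** Hypothetical minimal AST node: a label plus ordered children. */
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }
    }

    /** Number of nodes in the subtree; used as the cost of inserting or deleting it. */
    static int size(Node n) {
        int total = 1;
        for (Node c : n.children) {
            total += size(c);
        }
        return total;
    }

    /** Top-down edit distance: relabel cost at the root plus Levenshtein over the children. */
    static int distance(Node a, Node b) {
        int relabel = a.label.equals(b.label) ? 0 : 1;
        List<Node> xs = a.children;
        List<Node> ys = b.children;
        int[][] d = new int[xs.size() + 1][ys.size() + 1];

        // First column and row: delete or insert every child subtree seen so far.
        for (int i = 1; i <= xs.size(); i++) {
            d[i][0] = d[i - 1][0] + size(xs.get(i - 1));
        }
        for (int j = 1; j <= ys.size(); j++) {
            d[0][j] = d[0][j - 1] + size(ys.get(j - 1));
        }

        for (int i = 1; i <= xs.size(); i++) {
            for (int j = 1; j <= ys.size(); j++) {
                int del = d[i - 1][j] + size(xs.get(i - 1));
                int ins = d[i][j - 1] + size(ys.get(j - 1));
                // "Substitution" is the recursive distance between the two child subtrees.
                int sub = d[i - 1][j - 1] + distance(xs.get(i - 1), ys.get(j - 1));
                d[i][j] = Math.min(sub, Math.min(del, ins));
            }
        }
        return relabel + d[xs.size()][ys.size()];
    }

    /** Hand-built tree standing in for: { x = 1; return x; } */
    static Node sampleBefore() {
        return new Node("Block",
                new Node("Assign", new Node("Name:x"), new Node("Literal:1")),
                new Node("Return", new Node("Name:x")));
    }

    /** Hand-built tree standing in for: { x = 2; log(); return x; } */
    static Node sampleAfter() {
        return new Node("Block",
                new Node("Assign", new Node("Name:x"), new Node("Literal:2")),
                new Node("Call", new Node("Name:log")),
                new Node("Return", new Node("Name:x")));
    }

    public static void main(String[] args) {
        // One relabel (Literal:1 -> Literal:2) plus one inserted two-node subtree (Call).
        System.out.println(distance(sampleBefore(), sampleAfter())); // prints 3
    }
}
```

The sketch recomputes subtree sizes and distances repeatedly, so it is only suitable for small trees. It also returns just a number; recovering the actual edit script means backtracking through the table, and that is where much of the complexity mentioned above starts to show.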

We had to roll our own for our line of SmartDifferencers; fortunately, we have really good front ends for many languages to produce accurate ASTs.

You end up with additional surprises: people want to compare comments even though what you have are ASTs, to compare broken files, to compare language dialects your grammar doesn't match, or to compare code that contains insertions of other languages, etc. Diffing by lines doesn't have these issues, which is one reason line-diff is widespread and tree-diff is not.

Ira Baxter
2

I wonder if there is an ANTLR extension somewhere that can do this....

http://www.antlr.org/

http://openjdk.java.net/projects/compiler-grammar/antlrworks/Java.g

Josh England
  • It's easy to get the AST of source files; it's another story to find out the similarities between them :). One of the local parsing gurus here at Stack Overflow, [Ira Baxter](http://stackoverflow.com/users/120163/ira-baxter), talks about this in a [Google Tech Talk](http://www.youtube.com/watch?v=C-_dw9iEzhA) where he mentions that his software does this. – Bart Kiers Dec 12 '11 at 12:45
  • Very true - it depends on what you mean by 'compare'. It might be better to define that first; an AST might not be the best thing to use. – Josh England Dec 13 '11 at 10:08