Code generating JUnit based on Abstract Syntax tree walk

Question

Assuming I have the following class and method:

package generation;

class HelloWorld {
  public boolean isEven(int val) {
    if ( (val % 2) == 0)
      return true;
    else
      return false;
  }
}

Assume I want to generate the following JUnit test:

package generation;

import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class HelloWorldTest {

    @Test
    public void testIsEven() {
        HelloWorld h = new HelloWorld();
        assertTrue(h.isEven(2));
        assertFalse(h.isEven(1));
    }
}

Given the following method of tree walking a Java Syntax Tree: How can I use the java Eclipse Abstract Syntax Tree in a project outside Eclipse? (ie not an eclipse plugin)

How would you generate the unit test case from the example class at the top?

Perhaps I misunderstood but it seems this isn't really a question it's an interesting idea and you'd like someone else to implement it for you. — z7sg Ѫ, Mar 23 '11 at 12:21
What about nested branches or expressions in your `if` statement which do not use literals? How should the JUnit be created? — David Weiser, Mar 23 '11 at 23:26
I'm interested in the simplest thing that could possibly work. I'm aware the complexity of this could explode quite quickly - so just want to see the simplest case. — hawkeye, Mar 23 '11 at 23:48
I hope you realise the folly of generating the test from the code?! — davmac, Mar 24 '11 at 04:03
@davmac - I assume you're saying that TDD is the 'correct' approach. I contend that this would be useful in a legacy code environment. — hawkeye, Mar 27 '11 at 10:16
@hawkeye, I'm saying that generating a test from existing code will only test that the code does what it does. Not that it does what it is supposed to do. — davmac, Mar 28 '11 at 00:41
@hawkeye, to be clear: generated tests will test implementation details as opposed to conformance to the specification or interface. If there are bugs in the implementation, a generated test might pass even though it would fail against a correct implementation of the tested code. — davmac, Mar 28 '11 at 00:49
@davmac - I hear you on that. I'm suggesting that the primary risk for legacy code, broken or not - is whether it ceases its existing behaviour or not (not whether it is or is not correct, because once it has lasted in production for a few years - making changes to it is actually the greater risk.) People work around broken code because the cost of change can be so high. — hawkeye, Mar 28 '11 at 01:25
@hawkeye - I really am not seeing it. There is no point having such tests which only "test" code that will never be changed. I.e. if you can't afford to change the behaviour, then you mustn't change the code; and in that case, the test is pointless. — davmac, Mar 28 '11 at 05:34
@davmac I was making a distinction between "using junit tests to test against changes you're intentionally introducing to the code under test" and "using junit to protect against tangential changes not intended to change the code under test but which end up doing so anyway", i.e. insurance for legacy code. (Imagine a codebase of 10M+ lines and 70+ coders with 10+ projects running concurrently in a very conservative company.) — hawkeye, Mar 29 '11 at 09:35

score 2 · Accepted Answer · answered Mar 24 '11 at 02:51

You need a lot more than just a parse tree. (This mistake is repeated by practically everybody that hasn't built a serious program analysis tool; you need a lot more machinery to do anything really interesting. It is why compilers aren't trivial).

The "simplest" case requires parsing the code of the method to be tested into ASTs, name/type resolving everything so you know the meaning of all the symbols (you have to know that val is an integer), and determining the control flows through the code and the predicates that control them.

With that information, you can essentially enumerate the valid control-flow paths, picking up the predicates along each path and forming, in essence, a conjunction of all the conditions along that path. (In your example, if .. val%2 ... return true; is one path, controlled by val%2==0.) You also have to worry about modelling how side effects along the path affect the various predicates. And you'd like range information on integers (and sizes of strings and arrays, etc.).
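To make the idea of a path predicate concrete, here is a minimal sketch for the isEven example. The paths are written out by hand; a real tool would derive them from the resolved AST and the control-flow graph. The names ControlPath and IsEvenPaths are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical model of one control-flow path through a method taking an int:
// the branch conditions met along the path and the value returned at its end.
final class ControlPath {
    final String description;
    final List<IntPredicate> conditions;
    final boolean expectedResult;

    ControlPath(String description, List<IntPredicate> conditions, boolean expectedResult) {
        this.description = description;
        this.conditions = conditions;
        this.expectedResult = expectedResult;
    }

    // The path predicate is the conjunction of every condition along the path.
    boolean pathPredicate(int val) {
        return conditions.stream().allMatch(c -> c.test(val));
    }
}

final class IsEvenPaths {
    // Hand-written for illustration (assumption): these two paths stand in for
    // what control-flow analysis of isEven would produce.
    static List<ControlPath> paths() {
        IntPredicate remainderIsZero = v -> v % 2 == 0;
        List<ControlPath> paths = new ArrayList<>();
        paths.add(new ControlPath("if branch: val % 2 == 0",
                List.of(remainderIsZero), true));
        paths.add(new ControlPath("else branch: !(val % 2 == 0)",
                List.of(remainderIsZero.negate()), false));
        return paths;
    }

    public static void main(String[] args) {
        for (ControlPath p : paths()) {
            System.out.println(p.description + " -> returns " + p.expectedResult);
        }
    }
}
```

Each path carries both the condition an input must satisfy and the result the method produces on that path, which is what a generated assertion needs.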

Then for each path, you need to generate a set of input arguments that makes the path predicate true; given that this predicate could be pretty complicated, you'll likely need some kind of SAT solver. With solutions to the path predicate, you now need to generate ASTs corresponding to the tests (e.g., set up variables to enable the method arguments to satisfy the predicate; for simple integer equations, you can likely just generate expressions for the arguments as in your example). Finally, assemble the test calls into an AST for a method, insert it into an AST representing a unit test case method, and prettyprint the result.
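For the single-integer case, the last two steps can be sketched roughly as follows. This is an illustration under loud assumptions: a brute-force search over small integers stands in for the solver, the two path predicates are supplied by hand rather than extracted from the code, and the test is emitted as text rather than built as an AST and prettyprinted. TestSketchGenerator and its methods are hypothetical names.

```java
import java.util.OptionalInt;
import java.util.function.IntPredicate;

final class TestSketchGenerator {

    // Stand-in for a real solver (assumption): try 0, 1, -1, 2, -2, ... up to
    // the bound and return the first value satisfying the path predicate.
    static OptionalInt solve(IntPredicate pathPredicate, int bound) {
        for (int i = 0; i <= bound; i++) {
            if (pathPredicate.test(i)) return OptionalInt.of(i);
            if (pathPredicate.test(-i)) return OptionalInt.of(-i);
        }
        return OptionalInt.empty();
    }

    // Emit one assertion line as text; a real tool would build an AST node.
    static String assertionFor(String call, boolean expected) {
        return (expected ? "assertTrue(" : "assertFalse(") + call + ");";
    }

    // Hand-supplied path predicates for isEven (assumption): the if branch
    // returns true, the else branch returns false.
    static String generate() {
        IntPredicate ifBranch = v -> v % 2 == 0;
        IntPredicate elseBranch = ifBranch.negate();

        StringBuilder sb = new StringBuilder();
        sb.append("    @Test\n");
        sb.append("    public void testIsEven() {\n");
        sb.append("        HelloWorld h = new HelloWorld();\n");
        appendCase(sb, ifBranch, true);
        appendCase(sb, elseBranch, false);
        sb.append("    }\n");
        return sb.toString();
    }

    private static void appendCase(StringBuilder sb, IntPredicate path, boolean expected) {
        OptionalInt input = solve(path, 100);
        if (input.isPresent()) { // skip paths for which no input was found
            String call = "h.isEven(" + input.getAsInt() + ")";
            sb.append("        ").append(assertionFor(call, expected)).append("\n");
        }
    }

    public static void main(String[] args) {
        System.out.print(generate());
    }
}
```

The search picks 0 and 1 rather than the 2 and 1 in the question, since any input satisfying a path predicate will do. Brute force stops being viable as soon as the predicates involve several variables or side effects, which is where a proper solver comes in.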

Well, that wasn't so hard :-}

Our DMS Software Reengineering Toolkit has a Java front end that will parse Java, produce ASTs, enumerate control flow paths through a method [this isn't so easy: consider exceptions], compute range constraints on integer variables, and give you the general ability to climb around the ASTs to extract/construct what you want. Doesn't include a SAT solver yet, but we've thought about it.

that sounds pretty awesome. Let me know if you've got any positions open :) — hawkeye, Mar 28 '11 at 01:22

score 0 · Answer 2 · answered Mar 23 '11 at 22:52

You might want to skip the awkward, hardly well-defined part of your question and look into property-based testing.
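As a rough illustration of the idea, the sketch below checks two properties of isEven against random inputs using only the JDK. It is hand-rolled and not the API of any property-based testing library; IsEvenProperties, findCounterexample, and the chosen properties are assumptions made for this example, and HelloWorld is copied from the question.

```java
import java.util.Random;
import java.util.function.IntPredicate;

// The class under test, as given in the question.
class HelloWorld {
    public boolean isEven(int val) {
        if ((val % 2) == 0)
            return true;
        else
            return false;
    }
}

final class IsEvenProperties {

    // Try the property on `trials` random ints; return the first input that
    // violates it, or null if none was found. The seed makes runs repeatable.
    static Integer findCounterexample(IntPredicate property, int trials, long seed) {
        Random random = new Random(seed);
        for (int i = 0; i < trials; i++) {
            int candidate = random.nextInt();
            if (!property.test(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        HelloWorld h = new HelloWorld();

        // Property 1: consecutive integers never have the same parity.
        IntPredicate neighboursDiffer = n -> h.isEven(n) != h.isEven(n + 1);
        // Property 2: doubling any int gives an even number (wrap-around on
        // overflow keeps the low bit at zero).
        IntPredicate doublesAreEven = n -> h.isEven(2 * n);

        System.out.println("neighboursDiffer counterexample: "
                + findCounterexample(neighboursDiffer, 1000, 42L));
        System.out.println("doublesAreEven counterexample: "
                + findCounterexample(doublesAreEven, 1000, 42L));
    }
}
```

Unlike tests derived from the implementation, these properties come from what "even" is supposed to mean, so they do not depend on the branches inside the method.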

answered by Raphael

I believe Agitar had a tool that did that. It just looked at method signatures and fired in edge case values. I'm interested in the simplest case of code analysis->code generation. – hawkeye Mar 23 '11 at 23:50
