Use ANTLR to find Variable usage/reference in Java source-code?

Question

A variable usage is basically every occurrence of a variable after its declaration in the same scope, where some operation may be applied to it. Variable usage highlighting is even supported in some IDEs like IntelliJ and Eclipse.

I was wondering if there is a way to find variable usages using ANTLR. I have already generated the Lexer, Parser, and BaseListener classes by running ANTLR on Java8.g4. I can find variable declarations, but I am not able to find variable usages in a given piece of Java source code. How can I do this?

Example :

int i;    // Variable declaration
i++;      // Variable usage
i = 2;    // Variable usage
foo(i);   // Variable 'i' usage

I am able to capture the declaration, but not the usages, using the Listener class. I am parsing Java source code here.

There is a related question about [parsing context-sensitive languages in ANTLR](https://stackoverflow.com/questions/5126779/parsing-context-sensitive-language). — Anderson Green, Jun 08 '17 at 05:57
I am sorry, but I can't understand how to find locations of a variable used in the code from it. @AndersonGreen — Jarvis, Jun 08 '17 at 05:58

Jiri Tousek · Accepted Answer · 2017-06-08T12:25:58.760

I'll assume you're only considering local variables.

You'll need scopes and resolving to do this.

A scope represents Java's variable scope: it holds information about which variables are declared in it. You'll have to create one when you enter a Java scope (start of a block, a method, ...) and discard it upon leaving that scope. Keep a stack of scopes to represent nested blocks (Java doesn't allow a local variable to be hidden by another local variable in a nested scope, but you still need to track when a variable goes out of scope at the end of a nested block).
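The scope stack described above can be sketched in plain Java as follows. This is only a minimal illustration: `ScopeStack` and its method names are hypothetical and not part of ANTLR. The assumption is that you would call `enterScope`/`exitScope` from the generated listener's enter/exit callbacks for block-like rules, and `declare` from the callback for local variable declarations; which callbacks exist depends on the rule names in your grammar.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: a stack of scopes, each holding the names declared in it. */
class ScopeStack {
    // Innermost scope is at the head of the deque.
    private final Deque<Set<String>> scopes = new ArrayDeque<>();

    /** Call when the listener enters a block, method body, etc. */
    public void enterScope() {
        scopes.push(new HashSet<>());
    }

    /** Call when the listener leaves it; its variables go out of scope. */
    public void exitScope() {
        scopes.pop();
    }

    /** Call when a local variable declaration is encountered. */
    public void declare(String name) {
        Set<String> current = scopes.peek();
        if (current == null) {
            throw new IllegalStateException("declare() called outside any scope");
        }
        current.add(name);
    }

    /** True if the name is declared in the current scope or any enclosing one. */
    public boolean isDeclared(String name) {
        for (Set<String> scope : scopes) {
            if (scope.contains(name)) {
                return true;
            }
        }
        return false;
    }
}
```

A variable declared in an outer scope stays visible in nested scopes, while one declared in a nested scope disappears as soon as `exitScope` is called for that scope.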

Then you'll need to resolve each name you encounter in the parsed input, i.e. determine (using the scopes) whether the name refers to a variable or not. Basically, a name refers to a local variable whenever it is the first part of a name (before any `.`), is not followed by `(`, and matches the name of a local variable that is currently in scope.
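As a rough sketch of that resolution rule, assuming you already have the set of local variable names visible at the current position (for example, collected from your scope stack): `NameResolver` and `refersToLocal` are hypothetical names for illustration, and the sketch works on plain strings rather than on ANTLR parse-tree contexts.

```java
import java.util.Set;

/** Hypothetical helper implementing the "first part of the name" rule. */
class NameResolver {
    /**
     * @param dottedName      a name as written in the source, e.g. "i" or "System.out.println"
     * @param followedByParen whether the whole dotted name is directly followed by '('
     * @param visibleLocals   names of the local variables in scope at this position
     * @return true if the first part of the name refers to a local variable
     */
    public static boolean refersToLocal(String dottedName,
                                        boolean followedByParen,
                                        Set<String> visibleLocals) {
        int dot = dottedName.indexOf('.');
        // An undotted name directly followed by '(' is a method call such as foo(...).
        if (dot < 0 && followedByParen) {
            return false;
        }
        // Only the part before the first '.' can name a local variable.
        String firstPart = dot < 0 ? dottedName : dottedName.substring(0, dot);
        return visibleLocals.contains(firstPart);
    }
}
```

For instance, `refersToLocal("System.out.println", true, locals)` is true only while a local variable named `System` is in `locals`, whereas `refersToLocal("foo", true, locals)` is always false because `foo(` is a method call.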

The parser alone cannot do this for you, because whether a name refers to a variable or not depends on which variables are in scope at that point:

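public class Main {
    private static class A {
        B out = new B();
    }

    private static class B {
        // Ignores its argument and always prints "ha".
        void println(String foo) {
            System.out.println("ha");
        }
    }

    public static void main(String[] args) {
        {
            A System = new A();       // local variable named System
            System.out.println("a");  // System is the local here: calls B.println
        }
        System.out.println("b");      // local is out of scope: java.lang.System
    }
}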

Output:

ha
b

If you're also considering instance and static fields rather than just local variables, the resolving part becomes much more complicated, because you'll need to consider all classes in the current class's hierarchy, their instance and static fields, visibility, etc. to determine whether a variable of a given name exists and is visible.

Actually, using the Listener class of ANTLR, I was able to capture all the `expressionname`s representing variables as specified in `Java8.g4`. — Jarvis, Jun 09 '17 at 03:57
Does it work for the code in this answer, i.e. it only identifies `System` on the line with `"a"` as variable reference? — Jiri Tousek, Jun 09 '17 at 09:16
It correctly shows that 'System' is a variable on that line. I am not sure what you mean by `"a" as variable reference`. — Jarvis, Jun 09 '17 at 09:49
What I meant is that it only identifies `System` as a variable on that line (second occurrence), and identifies it correctly as a class reference in the two remaining cases. — Jiri Tousek, Jun 09 '17 at 14:11
