I am using JavaParser (open source) to parse the following code.

package testfiles.simple.tricky.before;

import testfiles.simple.before.InnerClassSample;

public class InnerClassReference {
    public void ref(InnerClassSample.MyInnerClass myInnerClass, java.util.List<Long> list) {
        int i = 0;
    }
}

Under the MethodDeclaration node for ref, the parameter nodes have the following hierarchy:

// myInnerClass
Parameter
  ClassOrInterfaceType:
    ClassOrInterfaceType:
      SimpleName: InnerClassSample
    SimpleName: MyInnerClass
  SimpleName: myInnerClass
// list
Parameter
  ClassOrInterfaceType:
    ClassOrInterfaceType:
      ClassOrInterfaceType:
        SimpleName: java
      SimpleName: util
    SimpleName: List
  SimpleName: list

I need to find the fully qualified names for each parameter. So for myInnerClass I would get testfiles.simple.before.InnerClassSample$MyInnerClass and for list it would be java.util.List.

I understand that one would normally write List<Long> with an import statement instead of java.util.List<Long>. However, I need to handle cases where the FQN is written directly in the parameter type.

Now my question: is there a way to statically distinguish whether such a parsed tree denotes a nested class or is merely the fully qualified name of a top-level class?

I have thought about checking whether a SimpleName starts with a lower-case letter (which would suggest a package name). But this is only a convention: we cannot safely assume that a developer always starts package names with a lower-case letter and class names with a capital letter. Hence I do not think this is a reliable approach.

Any idea or insight about this matter would be much appreciated.

Lii
roksui
  • I think that in this specific case you can tell that `MyInnerClass` is a nested class (or interface), because there is an import of `InnerClassSample`. But if you have two qualified names without any imports then I'm pretty sure there is no way to determine this. – Lii Jan 10 '22 at 08:30
  • I think maybe you're using the term "static" wrong? "Static" is often used to mean "during compilation". It is possible to know this information during compilation (that's what javac does), but only by analysing multiple source files at once. Your question seems to be about how this can be done by analysing only a single source file. – Lii Jan 10 '22 at 08:33
  • @Lii What I meant by static is the latter (analysing multiple source files at once). That is my bad for confusing the term. So does that mean JavaParser (without the use of its JavaSymbolSolver) is technically NOT a 'static' parser/analyzer? – roksui Jan 10 '22 at 08:36
  • I think it's right to say that JavaParser provides static information so it is a static analyser. But it does not provide ALL possible static information, just the information that is provided by parsing source code to an AST. JavaSymbolSolver provides a bit more static information. Other more advanced tools provide even more static information. – Lii Jan 10 '22 at 08:43
  • @Lii Right, so we can conclude that from an AST, if two trees have the same syntax, we would need additional information to infer/distinguish the types. I think the best I can do for now is to cover both cases and give two possible FQNs for a parameter that has multiple `ClassOrInterfaceType` nodes. – roksui Jan 11 '22 at 00:48

1 Answer

It is unfortunately not possible, at least not in all cases, to distinguish the name of a top-level class from the name of a nested class using only the qualified name, or by analysing only a single source file.

To make that distinction you have to perform a resolution step, to find out what the names reference. This necessarily involves multiple source files.

The resolution step probably involves looking at the information in the AST or the class files of the referenced elements.
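
As a rough illustration of such a resolution step, here is a minimal sketch that uses only the JDK and consults class files on the classpath. It is not JavaParser or JavaSymbolSolver code: `QualifiedNameResolver`, `candidates` and `resolve` are hypothetical names made up for this example. It assumes the dotted name has already been expanded using the file's imports (for example `InnerClassSample.MyInnerClass` to `testfiles.simple.before.InnerClassSample.MyInnerClass`) and that the referenced classes are compiled and loadable. The order in which candidates are tried is a simplification and does not reproduce the compiler's own disambiguation rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/**
 * Hypothetical helper (not part of JavaParser): maps a dotted name such as
 * "java.util.Map.Entry" to a binary name by consulting loadable class files.
 */
final class QualifiedNameResolver {

    /**
     * Lists every binary name the dotted name could stand for. Each candidate
     * treats a different segment as the top-level class; all later segments
     * are nested classes and are joined with '$'.
     */
    static List<String> candidates(String dottedName) {
        String[] parts = dottedName.split("\\.");
        List<String> result = new ArrayList<>();
        // 'top' is the index of the segment taken as the top-level class.
        // The longest package prefix is tried first (an assumption of this sketch).
        for (int top = parts.length - 1; top >= 0; top--) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < parts.length; i++) {
                if (i > 0) {
                    sb.append(i <= top ? '.' : '$');
                }
                sb.append(parts[i]);
            }
            result.add(sb.toString());
        }
        return result;
    }

    /** Returns the first candidate that the given class loader can load, if any. */
    static Optional<String> resolve(String dottedName, ClassLoader loader) {
        for (String candidate : candidates(dottedName)) {
            try {
                // initialize=false: load the class without running static initializers.
                Class.forName(candidate, false, loader);
                return Optional.of(candidate);
            } catch (ClassNotFoundException | LinkageError e) {
                // Not loadable under this interpretation; try the next candidate.
            }
        }
        return Optional.empty();
    }
}
```

With the system class loader, `QualifiedNameResolver.resolve("java.util.Map.Entry", ClassLoader.getSystemClassLoader())` yields `java.util.Map$Entry`, while `java.util.List` resolves to itself. If the referenced classes are available only as source, `candidates` alone gives the list of possible FQNs, and the check against class files would have to be replaced by a lookup in the parsed sources.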


Note: Even if the resolution step involves multiple source files, the information it produces is still static information.

Lii