Looking at the usage on a per-method basis, i.e. by analyzing all instructions, has some pitfalls. Besides method invocations, there might be method references, which are encoded using an invokedynamic instruction that has a handle to the target method in its bootstrap method (bsm) arguments. If the byte code hasn’t been generated from ordinary Java code (or stems from a future version), you have to be prepared to encounter ldc instructions pointing to a handle, which would yield a MethodHandle at runtime.
Since you already mentioned “analysis of inheritance”, I just want to point out the corner cases, i.e. for

    package foo;

    class A {
        public void method() {}
    }

    class B extends A implements bar.If {
    }

    package bar;

    public interface If {
        void method();
    }

it’s easy to overlook that A.method() has to stay public, because B inherits it as its implementation of If.method().
If you stay conservative, i.e. you assume that B instances may end up as targets of If.method() invocations elsewhere in your application whenever you can’t prove otherwise, you won’t find much to optimize. I think that you need at least inlining of bridge methods and the synthetic inner/outer class accessors to identify unused members across inheritance relationships.
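The corner case can be observed at runtime with a minimal sketch. It assumes that B is meant to extend A, nests all types in one hypothetical class InheritedImplementationDemo instead of using the packages foo and bar, and lets method() return a String so the effect is visible; none of these names or changes come from the original code.

```java
// Hypothetical demo class; A, B and If mirror the example above but are nested for brevity.
class InheritedImplementationDemo {
    interface If {
        String method();
    }

    static class A {
        // A does not implement If, yet this method becomes the implementation
        // of If.method() for every subclass that declares "implements If".
        public String method() {
            return "A.method";
        }
    }

    // B declares no method of its own; it inherits A.method().
    static class B extends A implements If {
    }

    // The call site only names If.method(); A.method() never appears in its instructions.
    static String callThroughInterface(If target) {
        return target.method();
    }

    public static void main(String[] args) {
        System.out.println(callThroughInterface(new B())); // prints "A.method"
    }
}
```

An analysis that only looks for instructions naming A.method() finds none here, although the method is clearly in use and must keep its public modifier.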
When it comes to class references, there are indeed even more possibilities that make a per-instruction analysis error-prone. They may occur not only as the owner of member access instructions, but also for new, checkcast, instanceof and array-specific instructions, annotations, exception handlers and, even worse, within signatures, which may occur at member references, annotations, local variable debugging hints, etc. The ldc instruction may refer to classes, producing a Class instance, which is actually used in ordinary Java code, e.g. for class literals. But as said, there’s also the theoretical possibility to produce MethodHandles, which may refer to an owner class but also have a signature bearing parameter types and a return type, or to produce a MethodType representing a signature.
You are better off analyzing the constant pool; however, that’s not offered by ASM. To be precise, a ClassReader has methods to access the pool, but they are actually not intended to be used by client code (as their documentation states). Even there, you have to be aware of pitfalls. Basically, the contents of a CONSTANT_Utf8_info entry bear a class or signature reference if a CONSTANT_Class_info or, respectively, the descriptor index of a CONSTANT_NameAndType_info or a CONSTANT_MethodType_info points to it. However, declared members of a class have direct references to CONSTANT_Utf8_info pool entries to describe their signatures, see Methods and Fields. Likewise, annotations don’t follow the pattern and have direct references to CONSTANT_Utf8_info entries of the pool, assigning a type or signature semantic to them, see enum_const_value and class_info_index…
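To illustrate the basic pattern, here is a rough sketch that reads the constant pool with nothing but the JDK's DataInputStream. The class ConstantPoolScanner and its members scan, classNames and descriptors are hypothetical names made up for this example. It only follows CONSTANT_Class_info, CONSTANT_NameAndType_info and CONSTANT_MethodType_info entries, so it deliberately misses the direct CONSTANT_Utf8_info references of declared members and annotations mentioned above.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of ASM or the JDK.
class ConstantPoolScanner {
    /** Internal names (or array descriptors) referenced by CONSTANT_Class_info entries. */
    final Set<String> classNames = new TreeSet<>();
    /** Descriptors referenced by CONSTANT_NameAndType_info or CONSTANT_MethodType_info entries. */
    final Set<String> descriptors = new TreeSet<>();

    static ConstantPoolScanner scan(InputStream classFile) throws IOException {
        DataInputStream in = new DataInputStream(classFile);
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        in.readUnsignedShort();             // minor_version
        in.readUnsignedShort();             // major_version
        int count = in.readUnsignedShort(); // constant_pool_count; valid indices are 1..count-1
        String[] utf8 = new String[count];
        List<Integer> classNameIndices = new ArrayList<>();
        List<Integer> descriptorIndices = new ArrayList<>();
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1:  utf8[i] = in.readUTF(); break;              // Utf8 (modified UTF-8)
                case 3: case 4: in.readInt(); break;                 // Integer, Float
                case 5: case 6: in.readLong(); i++; break;           // Long, Double occupy two slots
                case 7:  classNameIndices.add(in.readUnsignedShort()); break; // Class
                case 8: case 19: case 20: in.readUnsignedShort(); break;      // String, Module, Package
                case 9: case 10: case 11:                            // Fieldref, Methodref, InterfaceMethodref
                case 17: case 18: in.readInt(); break;               // Dynamic, InvokeDynamic (two u2 values)
                case 12:                                             // NameAndType: name, then descriptor
                    in.readUnsignedShort();
                    descriptorIndices.add(in.readUnsignedShort());
                    break;
                case 15: in.readUnsignedByte(); in.readUnsignedShort(); break; // MethodHandle
                case 16: descriptorIndices.add(in.readUnsignedShort()); break; // MethodType
                default: throw new IOException("unknown constant pool tag " + tag);
            }
        }
        // Resolve the indices only now, since entries may point forward in the pool.
        ConstantPoolScanner result = new ConstantPoolScanner();
        for (int index : classNameIndices) {
            result.classNames.add(utf8[index]);
        }
        for (int index : descriptorIndices) {
            result.descriptors.add(utf8[index]);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = Object.class.getResourceAsStream("Object.class")) {
            ConstantPoolScanner result = scan(in);
            System.out.println("classes: " + result.classNames);
            System.out.println("descriptors: " + result.descriptors);
        }
    }
}
```

The descriptors still have to be parsed to extract the class names they contain, and a complete analysis would additionally have to walk the fields, methods and attributes of the class file to catch the direct references.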