How to monitor object creation using java agent and ASM?

Question

What I want to do is to monitor the object creation and record a unique ID for that object.

First, I tried to instrument the NEW instruction, but that does not work and throws VerifyError: (...) Expecting to find object/array on stack. I have read that the object is still uninitialized right after NEW, so it cannot be passed to other methods. So I abandoned this approach.

Second, I tried to instrument the invocation of <init>, the method that initializes the uninitialized object. But I am not sure whether the initialized object is left on the stack after the initialization.

In my method visitor adapter:

public void visitMethodInsn(int opc, String owner, String name, String desc, boolean isInterface) {
    ...
    mv.visitMethodInsn(opc, owner, name, desc, isInterface);
    if (opc == INVOKESPECIAL && name.equals("<init>")) {
        mv.visitInsn(DUP);
        mv.visitMethodInsn(INVOKESTATIC, "org/myekstazi/agent/PurityRecorder", "object_new",
                "(Ljava/lang/Object;)V", false);
    }
}

In MyRecorder.java:

public static void object_new(Object ref){
    log("object_new !");
    log("MyRecorder: " + ref);
    log("ref.getClass().getName(): " + ref.getClass().getName());
}

I tried this in a demo, and it throws a VerifyError:

Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.VerifyError: Operand stack underflow
Exception Details:
  Location:
    AbstractDemo.<init>()V @4: dup
  Reason:
    Attempt to pop empty stack.
  Current Frame:
    bci: @4
    flags: { }
    locals: { 'AbstractDemo' }
    stack: { }
  Bytecode:
    0x0000000: 2ab7 0001 59b8 003b b1

        at java.lang.Class.getDeclaredMethods0(Native Method)
        at java.lang.Class.privateGetDeclaredMethods(Unknown Source)
        at java.lang.Class.privateGetMethodRecursive(Unknown Source)
        at java.lang.Class.getMethod0(Unknown Source)
        at java.lang.Class.getMethod(Unknown Source)
        at sun.launcher.LauncherHelper.validateMainClass(Unknown Source)
        at sun.launcher.LauncherHelper.checkAndLoadMain(Unknown Source)

This does not seem to work either. Are there any alternatives for monitoring object creation?

If you are fine with using other tools, there is library developed by Google. https://github.com/google/allocation-instrumenter — JojOatXGME, Jun 23 '20 at 21:56

score 4 · Accepted Answer · answered Jun 22 '20 at 08:21

The part of the message

Location:
  AbstractDemo.<init>()V @4: dup

hints at it: you are instrumenting a constructor. Within a constructor, invokespecial <init> is also used to delegate to another constructor, either in the same class or in the superclass.

The typical sequence for calling another constructor is aload_0 (this), push arguments, invokespecial <init>, so there is no reference to the object on the stack after the invocation.

This is what the decoded bytecode from the VerifyError looks like:

  0 aload_0
  1 invokespecial   [1]
  4 dup
  5 invokestatic    [59]
  8 return

Normally, you don’t want to report these delegating constructor calls, as they would cause the same object to be reported multiple times. But identifying them can be tricky, as the receiver class is not a reliable criterion. E.g., the following is valid Java code:

public class Example {
    Example reference;
    Example(Example anotherObject) {
        reference = anotherObject;
    }
    Example() {
        this(new Example(null));
        reference.reference = new Example(this);
    }
}

Here, we have a constructor containing three invokespecial instructions with the same target class, and the delegating constructor call is neither the first nor the last one, so there is no simple-to-check property of the instruction itself that tells you which one it is. To understand whether an instruction initializes the current instance, you have to identify the instruction providing its target as an aload of index zero, i.e. this, which is nontrivial when there are argument-providing instructions in between.

That said, even outside the constructor there is no guarantee that the newly instantiated object is on the stack. It is usually the case when the instantiation is used in an expression context where the result is subsequently stored or used, but not in a statement context. In other words, for a method like

void test() {
    new Example();
}

naive compiler implementations (like javac) may generate the equivalent of the expression code, followed by a pop instruction, but other implementations (like ecj) could elide the preceding dup in this case, eliminating the need for the subsequent pop, so that no reference will be on the stack after the invokespecial <init> instruction.

A safer approach is to search for instruction sequences starting with new and leading to an invokespecial <init> (allowing nested occurrences). Then, inject a dup right after the new instruction and the invokestatic after the invokespecial.
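The pairing logic described above can be sketched without the ASM API by modelling instructions as plain strings. This is only an illustration of the matching idea, not an ASM adapter: the class name NewInitPairing, the string format, and the pseudo-instruction "invokestatic Recorder.object_new" are all made up for the example, and it assumes compiler-generated code in which new and invokespecial <init> are properly nested.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical model: each instruction is a string such as "new Example"
// or "invokespecial Example.<init>". Not the ASM API.
class NewInitPairing {
    // Stand-in for the injected call to the recorder (assumed name).
    static final String REPORT = "invokestatic Recorder.object_new";

    /** Returns a copy of the instructions with dup/report injected around new...<init> pairs. */
    static List<String> instrument(List<String> insns) {
        List<String> out = new ArrayList<>();
        // One entry per 'new' whose <init> call has not been seen yet, innermost on top.
        Deque<String> pending = new ArrayDeque<>();
        for (String insn : insns) {
            out.add(insn);
            if (insn.startsWith("new ")) {
                pending.push(insn.substring("new ".length()));
                // Extra reference that is still on the stack after the constructor call.
                out.add("dup");
            } else if (insn.startsWith("invokespecial ") && insn.endsWith(".<init>")) {
                if (!pending.isEmpty()) {
                    // Completes the innermost pending 'new': report the initialized object.
                    pending.pop();
                    out.add(REPORT);
                }
                // Otherwise there is no preceding 'new', so this is a this(...) or
                // super(...) call inside a constructor and is left untouched.
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Roughly the shape of the Example() constructor from the answer (assumed listing).
        List<String> ctor = Arrays.asList(
                "aload_0",
                "new Example", "dup", "aconst_null", "invokespecial Example.<init>",
                "invokespecial Example.<init>", // delegating this(...) call
                "aload_0", "getfield Example.reference",
                "new Example", "dup", "aload_0", "invokespecial Example.<init>",
                "putfield Example.reference",
                "return");
        instrument(ctor).forEach(System.out::println);
    }
}
```

Because the injected dup is placed directly after new, the sketch does not care whether the compiler emitted its own dup, which covers the statement-context case. In a real ASM visitor the same bookkeeping would live in visitTypeInsn and visitMethodInsn, and the extra dup may raise the required maximum stack size, so the maxima would need to be recomputed.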

score 4 · Answer 2 · answered Jun 22 '20 at 09:18

Instrumenting new objects at the instantiation site can be quite tricky for the reasons described in Holger's answer. To instrument object allocations, agents usually take a different route: they modify the Object() constructor, since all normal constructors end up calling Object() through a chain of super() calls.

However, this will not catch all object allocations. If you care about arrays, you'll also need to instrument the newarray, anewarray, and multianewarray bytecodes.

Also, the native code and the JVM itself may create or clone objects without calling constructors. This needs to be handled separately with JVM TI.

For more information, take a look at this question.

Important: when instrumenting `Object()`, you need filters, as it is very likely that the Instrumentation causes object creations itself. — Holger, Jun 22 '20 at 10:00
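One possible shape for such a filter is a per-thread re-entrancy guard in the recorder, combined with a counter for the unique ID the question asks about. The sketch below is an assumption-laden illustration, not code from either answer: GuardedRecorder, objectNew, and LOG are hypothetical names, and the nested call inside objectNew only simulates what an instrumented constructor would do when the recorder itself allocates.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical recorder; all names are made up for illustration.
class GuardedRecorder {
    private static final AtomicLong NEXT_ID = new AtomicLong();
    // TRUE while the current thread is executing recorder code.
    private static final ThreadLocal<Boolean> BUSY =
            ThreadLocal.withInitial(() -> Boolean.FALSE);
    // In-memory sink standing in for a real log.
    static final List<String> LOG = Collections.synchronizedList(new ArrayList<>());

    /** Intended to be called by injected bytecode for every newly created object. */
    static void objectNew(Object ref) {
        if (BUSY.get()) {
            return; // allocation caused by the recorder itself: filter it out
        }
        BUSY.set(Boolean.TRUE);
        try {
            long id = NEXT_ID.incrementAndGet();
            String line = id + " " + ref.getClass().getName();
            // Simulation only: in an instrumented JVM, the String built above would
            // trigger this call by itself. The guard makes it a no-op.
            objectNew(line);
            LOG.add(line);
        } finally {
            BUSY.set(Boolean.FALSE);
        }
    }
}
```

Calling GuardedRecorder.objectNew(new Object()) adds exactly one entry to LOG, even though the recorder allocates while handling it. A real agent would need more care than this: the guard itself (here a ThreadLocal) may allocate on first use, so it would have to be initialized before the instrumented Object() constructor becomes active, and classes used by the recorder may need to be excluded explicitly.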
