
I'm trying to measure whether instanceof is really fast. Here is a very simple benchmark:

public Object a = 2;

@Benchmark
@Warmup(iterations = 5, timeUnit = TimeUnit.NANOSECONDS)
@Measurement(iterations = 5, timeUnit = TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
public boolean test() {
    return a instanceof Double;
}

I ran this benchmark and got the following result:

Benchmark              Mode  Cnt  Score   Error  Units
MyBenchmark.test       avgt    5  3.105 ± 0.086  ns/op

The assembly output of the benchmark is too long, so I have omitted it.

I also wrote a simple Java program:

private static Object i = 123;

public static boolean insOf(){
    return i instanceof Double;
}

public static void main(String[] args) throws IOException {
    for (int i = 0; i < 100000000; i++)
        if(insOf())
            System.out.print("");
}

The assembly output of the compiled insOf method is this:

0x00007fd761114b60: mov     %eax,0xfffffffffffec000(%rsp)
  0x00007fd761114b67: push    %rbp
  0x00007fd761114b68: sub     $0x10,%rsp        ;*synchronization entry
                                                ; - com.get.intent.App::insOf@-1 (line 24)

  0x00007fd761114b6c: movabs  $0xd6f788e0,%r10  ;   {oop(a 'java/lang/Class' = 'com/get/intent/App')}
  0x00007fd761114b76: mov     0x68(%r10),%r11d  ;*getstatic i
                                                ; - com.get.intent.App::insOf@0 (line 24)

  0x00007fd761114b7a: mov     0x8(%r11),%r10d   ; implicit exception: dispatches to 0x00007fd761114b9c
  0x00007fd761114b7e: cmp     $0x20002192,%r10d  ;   {metadata('java/lang/Double')}
  0x00007fd761114b85: jne     0x7fd761114b98          <-------- HERE!!!
  0x00007fd761114b87: mov     $0x1,%eax
  0x00007fd761114b8c: add     $0x10,%rsp
  0x00007fd761114b90: pop     %rbp
  0x00007fd761114b91: test    %eax,0x16774469(%rip)  ;   {poll_return}
  0x00007fd761114b97: retq
  0x00007fd761114b98: xor     %eax,%eax
  0x00007fd761114b9a: jmp     0x7fd761114b8c          <------- HERE!!!
  0x00007fd761114b9c: mov     $0xfffffff4,%esi
  0x00007fd761114ba1: nop
  0x00007fd761114ba3: callq   0x7fd7610051a0    ; OopMap{off=72}
                                                ;*instanceof
                                                ; - com.get.intent.App::insOf@3 (line 24)
                                                ;   {runtime_call}
  0x00007fd761114ba8: callq   0x7fd776591a20    ;*instanceof
                                                ; - com.get.intent.App::insOf@3 (line 24)
                                                ;   {runtime_call}

Tons of hlt instructions omitted.

From what I can see, instanceof compiles to about ten assembly instructions with two jumps (jne, jmp). The jumps are a little confusing. Why do we need them?

QUESTION: Is Java's instanceof really this fast?

St.Antario
  • Not entirely sure what you are asking. Is it really _how_ fast? As fast as somebody claims? Fast enough to be used extensively? Or whether there is a faster alternative? – tobias_k Oct 24 '17 at 14:23
  • Please be aware of [How do I write a correct micro-benchmark in Java?](https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java) and [What is microbenchmarking?](https://stackoverflow.com/questions/2842695/what-is-microbenchmarking). – Zabuzard Oct 24 '17 at 14:24
  • @tobias_k The question is about the number of instructions. As far as I can count, there are just 19 instructions (including jumps). But I'm not sure how many jumps are required and what that depends on. – St.Antario Oct 24 '17 at 14:25
  • @Zabuza Yes, I know, Thank you. – St.Antario Oct 24 '17 at 14:26

2 Answers


Well, you're dabbling in disassembly, so you probably will have to reconstruct the functions represented by those jumps. Might be a good idea to use a JVM whose source code is available and for which you have the debugging symbols.

Just based on the semantics of instanceof, I'd wager that what these jumps do is execute the same test recursively for superclasses, since instanceof can basically be written in functional pseudocode as instanceof(object, class) = class != null and (object.class == class or instanceof(object.class.superclass, class) or instanceof(any object.class.interface, class)).
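
The pseudocode above can be sketched in plain Java using reflection. This is only an illustration of the language-level semantics, not what the JIT compiler emits; the class name `InstanceOfSketch` and the helpers `isInstance` and `isSubtype` are hypothetical names made up for this example, and array covariance is deliberately not handled.

```java
public class InstanceOfSketch {

    // Hypothetical helper: null is never an instance of anything,
    // otherwise walk the hierarchy starting from the runtime class.
    static boolean isInstance(Object obj, Class<?> target) {
        return obj != null && isSubtype(obj.getClass(), target);
    }

    // Hypothetical helper: recursive walk over superclasses and interfaces.
    // Assumption: array types are out of scope for this sketch.
    static boolean isSubtype(Class<?> current, Class<?> target) {
        if (current == null) {
            return false;               // ran past Object or past a root interface
        }
        if (current == target) {
            return true;                // exact hit
        }
        if (isSubtype(current.getSuperclass(), target)) {
            return true;                // found via the superclass chain
        }
        for (Class<?> iface : current.getInterfaces()) {
            if (isSubtype(iface, target)) {
                return true;            // found via an implemented interface
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Object a = 2;                   // boxed as java.lang.Integer
        System.out.println(isInstance(a, Double.class));
        System.out.println(isInstance(a, Number.class));
        System.out.println(isInstance(a, Comparable.class));
    }
}
```

For ordinary (non-array) types this should agree with the built-in `Class.isInstance`, which is the library method to use in real code.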

Piotr Wilkin

Yes, it is that fast in most cases, including a trivial case like yours. The compiled code first optimistically checks for an exact match on the type (Double in your case), and then falls back to a slow branch that would pessimistically call into the runtime, because the class hierarchy might need to be walked. But in the Double case, the compiler knows that no subclasses of Double exist in the system, so the slow branch is trivially "false".

   mov     0x8(%r11),%r10d    ; read the klass ptr
   cmp     $0x20002192,%r10d  ;   klass for 'java/lang/Double'
   jne     NOT_EQUAL          ; not equal? slow branch
   mov     $0x1,%eax          ; return value = "true"
RETURN:
   add     $0x10,%rsp         ; epilog
   pop     %rbp
   test    %eax,0x16774469(%rip) 
   retq
NOT_EQUAL:
   xor     %eax,%eax          ; return value = "false"
   jmp     RETURN             
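
At the Java level, the listing above amounts to a null check followed by a single exact-class comparison. Here is a minimal sketch of that equivalence, assuming the target class has no subclasses (`java.lang.Double` is declared `final`); the class `ExactTypeCheck` and the method `isDouble` are hypothetical names used only for illustration.

```java
public class ExactTypeCheck {

    // Hypothetical helper: because Double has no subclasses, "obj instanceof Double"
    // can be decided by comparing the runtime class against Double.class exactly.
    static boolean isDouble(Object obj) {
        return obj != null && obj.getClass() == Double.class;
    }

    public static void main(String[] args) {
        Object[] samples = { 2, 2.0, "text", null };
        for (Object sample : samples) {
            // Both columns should agree for every sample.
            System.out.println(isDouble(sample) + " " + (sample instanceof Double));
        }
    }
}
```

The explicit `obj != null` here is only for the Java source; the compiled listing has no separate null-check instruction before the klass load.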
Aleksey Shipilev
  • Thank you for the clarification. But I have one more question about it. You marked the line `mov 0x8(%r11),%r10d` as `read the klass ptr`, but in the original assembly output it was marked as `implicit exception: dispatches to 0x00007fd761114b9c`. Does that mean it dispatches to the instruction right after the unconditional jump, the one that precedes the `runtime_call`? – St.Antario Oct 25 '17 at 10:26
  • `%r11` is the object address. The klass ptr resides in its header at offset `0x8`. The implicit exception is part of implicit null checking: the load will SEGV if `%r11` is `NULL`, and the VM machinery then dispatches back to the normal code at the address you see in the comment. – Aleksey Shipilev Oct 25 '17 at 11:57