While investigating an issue related to the instantiation of Spring's org.springframework.util.ConcurrentReferenceHashMap (as of spring-core-5.1.3.RELEASE), I used the LinuxPerfAsmProfiler shipped with JMH to profile the generated assembly.

I simply ran this benchmark:

@Benchmark
public Object measureInit() {
  return new ConcurrentReferenceHashMap<>();
}

Benchmarking on JDK 8 reveals one non-obvious hot spot:

  0.61%        0x00007f32d92772ea: lock addl $0x0,(%rsp)     ;*putfield count
                                                             ; - org.springframework.util.ConcurrentReferenceHashMap$Segment::<init>@11 (line 476)
                                                             ; - org.springframework.util.ConcurrentReferenceHashMap::<init>@141 (line 184)
 15.81%        0x00007f32d92772ef: mov    0x60(%r15),%rdx

This corresponds to the unnecessary assignment of the default value to a volatile field:

protected final class Segment extends ReentrantLock {
  private volatile int count = 0;
}
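To illustrate why the initializer is redundant, here is a minimal sketch. The class names are hypothetical stand-ins for Segment and are not part of Spring. Both variants read back 0, but only the first carries an explicit store in its constructor.

```java
// Hypothetical stand-ins for Segment; these names do not come from Spring.
class ExplicitZero {
    // The initializer is compiled into an explicit putfield inside <init>.
    // Because the field is volatile, that store must be followed by a
    // memory barrier (the lock addl seen in the JDK 8 listing on x86).
    private volatile int count = 0;

    int count() {
        return count;
    }
}

class ImplicitZero {
    // No initializer: the JVM zeroes the field when the object is allocated,
    // so <init> contains no store to it.
    private volatile int count;

    int count() {
        return count;
    }
}

class VolatileDefaultDemo {
    public static void main(String[] args) {
        // Both objects expose the same default value.
        System.out.println(new ExplicitZero().count());
        System.out.println(new ImplicitZero().count());
    }
}
```

Comparing the output of javap -c for the two classes should show the extra putfield only in ExplicitZero.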

and Segment is in turn instantiated in a loop in the constructor of ConcurrentReferenceHashMap:

public ConcurrentReferenceHashMap(
    int initialCapacity, float loadFactor, int concurrencyLevel, ReferenceType referenceType) {
  this.loadFactor = loadFactor;
  this.shift = calculateShift(concurrencyLevel, MAXIMUM_CONCURRENCY_LEVEL);
  int size = 1 << this.shift;
  this.referenceType = referenceType;
  int roundedUpSegmentCapacity = (int) ((initialCapacity + size - 1L) / size);
  this.segments = (Segment[]) Array.newInstance(Segment.class, size);
  for (int i = 0; i < this.segments.length; i++) {
    this.segments[i] = new Segment(roundedUpSegmentCapacity);
  }
}
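The arithmetic in this constructor determines how many Segment instances, and therefore how many volatile stores, each map creation performs. The sketch below re-implements it standalone. It assumes that calculateShift returns the smallest shift whose power of two reaches the requested level, that MAXIMUM_CONCURRENCY_LEVEL is 1 << 16, and that the no-argument constructor passes 16 for both the initial capacity and the concurrency level. None of these are shown in the snippet above, so treat them as assumptions.

```java
// A standalone sketch of the sizing arithmetic; not Spring's actual source.
class SegmentCountSketch {
    // Assumed value of the cap used by the constructor above.
    static final int MAXIMUM_CONCURRENCY_LEVEL = 1 << 16;

    // Assumed behaviour: smallest shift such that (1 << shift) >= minimumValue,
    // bounded by maximumValue.
    static int calculateShift(int minimumValue, int maximumValue) {
        int shift = 0;
        int value = 1;
        while (value < minimumValue && value < maximumValue) {
            value <<= 1;
            shift++;
        }
        return shift;
    }

    // Number of Segment objects the constructor loop would create.
    static int segmentCount(int concurrencyLevel) {
        return 1 << calculateShift(concurrencyLevel, MAXIMUM_CONCURRENCY_LEVEL);
    }

    // Same rounding-up division as roundedUpSegmentCapacity above.
    static int segmentCapacity(int initialCapacity, int concurrencyLevel) {
        int size = segmentCount(concurrencyLevel);
        return (int) ((initialCapacity + size - 1L) / size);
    }

    public static void main(String[] args) {
        System.out.println(segmentCount(16));        // prints 16
        System.out.println(segmentCapacity(16, 16)); // prints 1
    }
}
```

Under these assumptions a single new ConcurrentReferenceHashMap<>() constructs 16 segments, so the store to count and its barrier run 16 times per benchmark invocation.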

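So the instruction is likely to be really hot (the 15.81% reported on the following mov is most likely sample skid from the lock addl before it). The full assembly listing can be found in my gist.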

Then I ran the same benchmark on JDK 14, again with LinuxPerfAsmProfiler, but now the captured assembly contains no explicit reference to volatile int count = 0.

Looking for the lock addl $0x0 instruction (an add of 0 to a stack slot under the lock prefix, which HotSpot uses on x86 as the memory barrier after a volatile store), I found this:

  0.08%                          │  0x00007f3717d46187:   lock addl $0x0,-0x40(%rsp)
 23.74%                          │  0x00007f3717d4618d:   mov    0x120(%r15),%rbx

which likely corresponds to volatile int count = 0, because it follows the constructor call of Segment's superclass ReentrantLock:

  0.77%                          │  0x00007f3717d46140:   movq   $0x0,0x18(%rax)              ;*new {reexecute=0 rethrow=0 return_oop=0}
                                 │                                                            ; - java.util.concurrent.locks.ReentrantLock::<init>@5 (line 294)
                                 │                                                            ; - org.springframework.util.ConcurrentReferenceHashMap$Segment::<init>@6 (line 484)
                                 │                                                            ; - org.springframework.util.ConcurrentReferenceHashMap::<init>@141 (line 184)
  0.06%                          │  0x00007f3717d46148:   mov    %r8,%rcx
  0.05%                          │  0x00007f3717d4614b:   mov    %rax,%rbx
  0.03%                          │  0x00007f3717d4614e:   shr    $0x3,%rbx
  0.74%                          │  0x00007f3717d46152:   mov    %ebx,0xc(%r8)
  0.06%                          │  0x00007f3717d46156:   mov    %rax,%rbx
  0.05%                          │  0x00007f3717d46159:   xor    %rcx,%rbx
  0.02%                          │  0x00007f3717d4615c:   shr    $0x14,%rbx
  0.72%                          │  0x00007f3717d46160:   test   %rbx,%rbx
                             ╭   │  0x00007f3717d46163:   je     0x00007f3717d4617f
                             │   │  0x00007f3717d46165:   shr    $0x9,%rcx
                             │   │  0x00007f3717d46169:   movabs $0x7f370a872000,%rdi
                             │   │  0x00007f3717d46173:   add    %rcx,%rdi
                             │   │  0x00007f3717d46176:   cmpb   $0x8,(%rdi)
  0.00%                      │   │  0x00007f3717d46179:   jne    0x00007f3717d46509
  0.04%                      ↘   │  0x00007f3717d4617f:   movl   $0x0,0x14(%r8)
  0.08%                          │  0x00007f3717d46187:   lock addl $0x0,-0x40(%rsp)
 23.74%                          │  0x00007f3717d4618d:   mov    0x120(%r15),%rbx

The problem is that putfield count is not mentioned anywhere in the generated assembly.

Could anyone explain why I don't see it?

Sergey Tsypanov
  • Indeed, it looks like some JIT optimization broke the mapping between compiled code and bci. If you run JDK 14 with `-XX:MaxInlineLevel=0`, the "putfield count" annotation becomes visible again. – apangin Aug 13 '20 at 18:10
  • @apangin thanks! Should I then report it to `hotspot-compiler-dev` or `jmh-dev` mailing lists? What do you think? – Sergey Tsypanov Aug 13 '20 at 20:29
  • That's certainly not a jmh fault. Debug info is missing in `-XX:+PrintAssembly`. So, `hotspot-compiler-dev` will be more appropriate. – apangin Aug 13 '20 at 21:53
  • @apangin it looks like this won't be fixed: https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2020-August/039513.html – Sergey Tsypanov Aug 14 '20 at 10:22
  • Yeah, quite expected. I can't agree with the justification though. While it's true that it's hard to maintain the mapping in general, this does not mean it's absolutely impossible in a particular case. Ignoring such problems without even a minimal investigation will eventually leave us with no debug info at all. – apangin Aug 14 '20 at 11:42
  • @apangin btw, can the flag `-XX:MaxInlineLevel=0` distort the profile, in the sense that disabling inlining also disables some other optimizations? And if it can, do you know any examples of such behaviour? – Sergey Tsypanov Oct 11 '21 at 07:48
  • It doesn't disable other optimizations directly, but definitely makes them less efficient, as the scope where optimizations can be applied gets notably reduced. See https://stackoverflow.com/a/52468276/3448419 – apangin Oct 11 '21 at 21:06

1 Answer

It turned out that you can't use an hsdis built for, e.g., JDK 8 with JDK 11. For a perfect match you need to build hsdis from the JDK sources, then build the JDK itself and run the application on this ad-hoc build.

This approach worked perfectly for me when I was investigating "Missing bounds checking elimination in String constructor?".

Sergey Tsypanov