Question
Why is the use of fBuffer1
in the attached code example (SELECT_QUICK = true
) double as fast as the other variant when fBuffer2
is resized only once at the beginning (SELECT_QUICK = false
)?
The code path is absolutely identical but even after 10 minutes the throughput of fBuffer2
does not increase to this level of fBuffer1
.
Background:
We have a generic data processing framework that collects thousands of Java primitive values in different subclasses (one subclass for each primitive type). These values are stored internally in arrays, which we originally sized sufficiently large. To save heap memory, we have now switched these arrays to dynamic resizing (arrays grow only if needed). As expected, this change has massively reduce the heap memory. However, on the other hand the performance has unfortunately degenerated significantly. Our processing jobs now take 2-3 times longer as before (e.g. 6 min instead of 2 min as before).
I have reduced our problem to a minimum working example and attached it. You can choose with SELECT_QUICK
which buffer should be used. I see the same effect with jdk-1.8.0_202-x64
as well as with openjdk-17.0.1-x64
.
Buffer 1 (is not resized) shows the following numbers:
duration buf1: 8,890.551ms (8.9s)
duration buf1: 8,339.755ms (8.3s)
duration buf1: 8,620.633ms (8.6s)
duration buf1: 8,682.809ms (8.7s)
...
Buffer 2 (is resized exactly 1 time at the beginning) shows the following numbers:
make buffer 2 larger
duration buf2 (resized): 19,542.750ms (19.5s)
duration buf2 (resized): 22,423.529ms (22.4s)
duration buf2 (resized): 22,413.364ms (22.4s)
duration buf2 (resized): 22,219.383ms (22.2s)
...
I would really appreciate to get some hints, how I can change the code so that fBuffer2
(after resizing) works as fast as fBuffer1
. The other way round (make fBuffer1
as slow as fBuffer2
) is pretty easy. ;-) Since this problem is placed in some framework-like component I would prefer to change the code instead of tuning the hotspot (with external arguments). But of course, suggestions in both directions are very welcome.
Source Code
import java.util.Locale;
public final class Collector {
private static final boolean SELECT_QUICK = true;
private static final long LOOP_COUNT = 50_000L;
private static final int VALUE_COUNT = 150_000;
private static final int BUFFER_LENGTH = 100_000;
private final Buffer fBuffer = new Buffer();
public void reset() {fBuffer.reset();}
public void addValueBuf1(long val) {fBuffer.add1(val);}
public void addValueBuf2(long val) {fBuffer.add2(val);}
public static final class Buffer {
private int fIdx = 0;
private long[] fBuffer1 = new long[BUFFER_LENGTH * 2];
private long[] fBuffer2 = new long[BUFFER_LENGTH];
public void reset() {fIdx = 0;}
public void add1(long value) {
ensureRemainingBuf1(1);
fBuffer1[fIdx++] = value;
}
public void add2(long value) {
ensureRemainingBuf2(1);
fBuffer2[fIdx++] = value;
}
private void ensureRemainingBuf1(int remaining) {
if (remaining > fBuffer1.length - fIdx) {
System.out.println("make buffer 1 larger");
fBuffer1 = new long[(fIdx + remaining) << 1];
}
}
private void ensureRemainingBuf2(int remaining) {
if (remaining > fBuffer2.length - fIdx) {
System.out.println("make buffer 2 larger");
fBuffer2 = new long[(fIdx + remaining) << 1];
}
}
}
public static void main(String[] args) {
Locale.setDefault(Locale.ENGLISH);
final Collector collector = new Collector();
if (SELECT_QUICK) {
while (true) {
final long start = System.nanoTime();
for (long j = 0L; j < LOOP_COUNT; j++) {
collector.reset();
for (int k = 0; k < VALUE_COUNT; k++) {
collector.addValueBuf1(k);
}
}
final long nanos = System.nanoTime() - start;
System.out.printf("duration buf1: %1$,.3fms (%2$,.1fs)%n",
nanos / 1_000_000d, nanos / 1_000_000_000d);
}
} else {
while (true) {
final long start = System.nanoTime();
for (long j = 0L; j < LOOP_COUNT; j++) {
collector.reset();
for (int k = 0; k < VALUE_COUNT; k++) {
collector.addValueBuf2(k);
}
}
final long nanos = System.nanoTime() - start;
System.out.printf("duration buf2 (resized): %1$,.3fms (%2$,.1fs)%n",
nanos / 1_000_000d, nanos / 1_000_000_000d);
}
}
}
}