I'm trying to profile some code, but I'm running into some issues with what I think is Java's compiler being smart and changing the way the code works.
example methods:
public int method1(int bits)
{
// seed is a long object field
seed = seed * 0x5DEECE66DL + 0xBL & (1L << 48) - 1;
return (int) (seed >>> 48 - bits);
}
public int method2(int bits)
{
// seed is a long object field
seed *= 0x5DEECE66DL;
seed += 0xBL & (1L << 48) - 1;
return (int) (seed >>> 48 - bits);
}
public int method3(int bits)
{
// multiplier, seeds, n, and carry are long/long array object fields
final long t = multiplier * seeds[n] + carry;
// carry = t / b (done in an unsigned way)
final long div32 = t >>> 32; // div32 = t / (b+1)
carry = div32 + ((t & 0xFFFFFFFFL) >= 0xFFFFFFFFL - div32 ? 1L : 0L);
// seeds[n] = (b-1)-t%b (done in an unsigned way)
seeds[n] = 0xFFFFFFFEL - (t & 0xFFFFFFFFL) - (carry - div32 << 32) - carry & 0xFFFFFFFFL;
final long result = seeds[n];
n = n + 1 & r - 1;
return (int) (result >>> 32 - bits);
}
Now, to compare the performance, I'm using this test rig:
// gen is the object containing the methods/fields
int result;
long start = System.currentTimeMillis();
for (int i = 0; i < 0x7FFFFFFF; ++i)
{
result = gen.method1(32);
}
long end = System.currentTimeMillis();
System.out.println(end - start);
start = System.currentTimeMillis();
for (int i = 0; i < 0x7FFFFFFF; ++i)
{
result = gen.method2(32);
}
end = System.currentTimeMillis();
System.out.println(end - start);
start = System.currentTimeMillis();
for (int i = 0; i < 0x7FFFFFFF; ++i)
{
result = gen.method3(32);
}
end = System.currentTimeMillis();
System.out.println(end - start);
However, the results I'm getting back are rather odd:
7 3109 13402
method1
and method2
should both have roughly the same computation time as they compute the exact same thing. method3
I can understand taking longer, but looking at the amount of calculations, none of the 3 results seem long enough to me. This seems to flag to me that somehow Java's optimizing my test rig and not running it all 0x7FFFFFFF
times.
To fix this, I thought of declaring result
as a static volatile
field (outside of the test rig method), and doing that generated believable results:
21814 21468 26962
Now my question is is this the accepted way to profile methods? I would still like optimizations to occur in my methods I'm profiling, just not in the test rig (at least not optimizations which would result in the method being called less than the prescribed number of times or caching the results between runs).