0

I am running below simple program , I know this is not best way to measure performance but the results are surprising to me , hence wanted to post question here.

public class findFirstTest {

    public static void main(String[] args) {

        for(int q=0;q<10;q++) {
            long start2 = System.currentTimeMillis();
            int k = 0;
            for (int j = 0; j < 5000000; j++) {
                if (j > 4500000) {
                    k = j;
                    break;
                }
            }
            System.out.println("for value  " + k + " with time " + (System.currentTimeMillis() - start2));
        }
    }
}

results are like below after multiple times running code.

for value  4500001 with time 3
for value  4500001 with time 25 ( surprised as it took 25 ms in 2nd iteration)
for value  4500001 with time 0
for value  4500001 with time 0
for value  4500001 with time 0
for value  4500001 with time 0
for value  4500001 with time 0
for value  4500001 with time 0
for value  4500001 with time 0
for value  4500001 with time 0

so I am not understanding why 2nd iteration took 25ms but 1st 3ms and later 0 ms and also why always for 2nd iteration when I am running code.

if I move start and endtime printing outside of outer forloop then results I am having is like

for value  4500001 with time 10
prasannajoshi
  • 147
  • 11
  • 1
    Micro benchmarks are very hard to get it right, specially if you have a black box like a JVM on top of that. Furthermore, use System.nanoTime() instead of System.currentTimeMillis() for a better precision – dreamcrash Dec 26 '20 at 15:14
  • I guess you've answered your question *[..]this is not best way to measure performance[..]*. Regarding your second point, you are running your loop 10 times, so it is not a surprise that it prints 10 (each inner loop takes close to 1ms). – Thiyagu Dec 26 '20 at 15:14
  • 2
    Please read: [How do I write a correct micro-benchmark in Java?](https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java) --- I suspect that during the 2nd iteration of the outer loop, the JIT-compiler was activated. – Turing85 Dec 26 '20 at 15:17
  • Some internal optimization? – nice_dev Dec 26 '20 at 15:17
  • @dreamcrash yes I agree , to avoid such black box , I tried writing code in different method by passing argument for loop count and returned value from method to avoid optimisation , still same results with System.nanoTime() – prasannajoshi Dec 26 '20 at 15:21

2 Answers2

1

In first iteration, the code is running interpreted.

In second iteration, JIT kicks in, slowing it down a bit while it compiles to native code.

In remaining iterations, native code runs very fast.

Andreas
  • 154,647
  • 11
  • 152
  • 247
-1

Because your winamp needed to decode another few frames of your mp3 to queue it into the sound output buffers. Or because the phase of the moon changed a bit and your dynamic background needed changing, or because someone in east Croydon farted and your computer is subscribed to the 'smells from London' twitter feed. Who knows?

This isn't how you performance test. Your CPU is not such a simple machine after all; it has many cores, and each core has pipelines and multiple hierarchies of caches. Any given core can only interact with one of its caches, and because of this, if a core runs an instruction that operates on memory which is not currently in cache, then the core will shut down for a while: It sends to the memory controller a request to load the page of memory with the memory you need to access into a given cachepage, and will then wait until it is there; this can take many, many cycles.

On the other end you have an OS that is juggling hundreds of thousands of processes and threads, many of them internal to the kernel, per-empting like there is no tomorrow, and trying to give extra precedence to processes that are time sensitive, such as the aforementioned winamp which must get a chance to decode some more mp3 frames before the sound buffer is fully exhausted, or you'd notice skipping. This is non-trivial: On ye olde windows you just couldn't get this done which is why ye olde winamp was a magical marvel of engineering, more or less hacking into windows to ensure it got the priority it needed. Those days are long gone, but if you remember them, well, draw the conclusion that this isn't trivial, and thus, OSes do pre-empt with prejudice all the time these days.

A third significant factor is the JVM itself which is doing all sorts of borderline voodoo magic, as it has both a hotspot engine (which is doing bookkeeping on your code so that it can eventually conclude that it is worth spending considerable CPU resources to analyse the heck out of a method to rewrite it in optimized machinecode because that method seems to be taking a lot of CPU time), and a garbage collector.

The solution is to forget entirely about trying to measure time using such mere banalities as measuring currentTimeMillis or nanoTime and writing a few loops. It's just way too complicated for that to actually work.

No. Use JMH.

rzwitserloot
  • 85,357
  • 5
  • 51
  • 72