
I'm using JUnit to run a primitive benchmark like this:

@Test
public void testTime() throws Exception{
    LoadoutBase<?> loadout = new LoadoutStandard("AS7-D-DC");

    final int iterations = 1000000;

    long start = System.nanoTime();
    int sum = 0;
    for(int i = 0; i < iterations; ++i){
        Iterator<Item> it = loadout.iterator();
        while( it.hasNext() ){
            sum += it.next().getNumCriticalSlots();
        }
    }
    long end = System.nanoTime();

    long time_it = end - start;

    start = System.nanoTime();
    sum = 0;
    for(int i = 0; i < iterations; ++i){
        for(Item item : loadout.getAllItems()){
            sum += item.getNumCriticalSlots();
        }
    }
    end = System.nanoTime();

    long time_arrays = end - start;
  
    System.out.println("it: " + time_it + " array: " + time_arrays + " diff: " + (double)time_it/time_arrays);
}

If I set iterations=1000000 then I get

it: 792771504 array: 1109215387 diff: 0.7147137637029551

very consistently but if I set iterations=10000 then I get

it: 32365742 array: 28902811 diff: 1.1198129482976587

with very wild fluctuations. The diff parameter is anywhere from 0.7 to 1.2.

Could anyone shed some light on what might be happening here? Which method should I choose?

Edit:

What I really am benchmarking is behind the scenes work. getAllItems creates a new List<Item> and populates it by getting all items from 16 sublists using addAll. The Iterator approach doesn't construct this temporary list but rather keeps track of in which of these 16 sublists it is currently iterating and has some logic to make the 16 sublists look like a continuous list.
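For reference, here is a hypothetical sketch of the two access patterns described above. The class, the element type, and the sublist count are stand-ins for illustration, not the real API: getAllItems-style flattens the 16 sublists into a temporary list, while the iterator-style walks them in place.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class SublistAccess {
    // 16 sublists of 4 items each, values 0..63 (illustrative data).
    static final List<List<Integer>> sublists = new ArrayList<>();
    static {
        for (int i = 0; i < 16; ++i) {
            List<Integer> part = new ArrayList<>();
            for (int j = 0; j < 4; ++j) part.add(i * 4 + j);
            sublists.add(part);
        }
    }

    // getAllItems-style: copy everything into one temporary list via addAll.
    static List<Integer> getAllItems() {
        List<Integer> all = new ArrayList<>();
        for (List<Integer> part : sublists) all.addAll(part);
        return all;
    }

    // Iterator-style: track the current sublist and position, no temporary list.
    static Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            int listIdx = 0, itemIdx = 0;

            @Override
            public boolean hasNext() {
                // Skip past exhausted (or empty) sublists.
                while (listIdx < sublists.size()
                        && itemIdx >= sublists.get(listIdx).size()) {
                    ++listIdx;
                    itemIdx = 0;
                }
                return listIdx < sublists.size();
            }

            @Override
            public Integer next() {
                return sublists.get(listIdx).get(itemIdx++);
            }
        };
    }

    public static void main(String[] args) {
        int sumA = 0, sumB = 0;
        for (int x : getAllItems()) sumA += x;
        Iterator<Integer> it = iterator();
        while (it.hasNext()) sumB += it.next();
        System.out.println(sumA + " " + sumB); // both sum 0..63 = 2016
    }
}
```

Both paths visit the same elements; the difference being benchmarked is the allocation and copying done up front by the list-building approach.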


2 Answers


If you want to test the difference between using an Iterator explicitly and using the enhanced for loop (which uses an Iterator behind the scenes for you), then you're doing it wrong. First, because the JIT has enough time to improve the results of the second approach but not the first, and for several other reasons explained here: How do I write a correct micro-benchmark in Java?. You can see very different results (again, as a result of the JIT) by doing these:

  • Add an outer loop that increases the number of times both measurements execute. That is, a for loop that wraps all of this code.
  • Move the enhanced for loop approach so it executes first.
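The first suggestion can be sketched as follows. This is a minimal, hypothetical harness (the two method bodies are stand-ins for the real iteration styles): an outer loop repeats both measurements, so the later rounds run against JIT-compiled code and only those are worth comparing.

```java
public class RepeatedBenchmark {
    // Stand-in for the explicit-Iterator style of iteration.
    static long iteratorStyle() {
        long sum = 0;
        for (int i = 0; i < 1_000_000; ++i) sum += i;
        return sum;
    }

    // Stand-in for the getAllItems / enhanced-for style of iteration.
    static long arrayStyle() {
        long sum = 0;
        for (int i = 0; i < 1_000_000; ++i) sum += i;
        return sum;
    }

    public static void main(String[] args) {
        long sink = 0; // consume results so the JIT cannot discard the loops
        for (int round = 0; round < 5; ++round) { // outer loop covering both measurements
            long t0 = System.nanoTime();
            sink += iteratorStyle();
            long t1 = System.nanoTime();
            sink += arrayStyle();
            long t2 = System.nanoTime();
            System.out.println("round " + round + ": it=" + (t1 - t0)
                    + " array=" + (t2 - t1));
        }
        System.err.println(sink); // keep the accumulated result observable
    }
}
```

Typically the first round or two are much slower and noisier than the rest; that early noise is what distorts the original single-pass comparison.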

The best way to obtain real results for your test would be to split the approaches into different methods, then measure each separately (accounting for the JVM warm-up phase and the other issues covered in the Q/A above). Also, I recommend that you not reinvent the wheel and instead use a proper benchmark framework based on JUnit. Here are two of them:
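A hypothetical sketch of that split, with each approach measured through a shared helper that performs a warm-up pass before the timed pass (the method and helper names are illustrative, not from any framework):

```java
import java.util.function.LongSupplier;

public class SplitBenchmark {
    // Run the task once through a warm-up pass, then time a second pass.
    static long measure(LongSupplier task, int iterations) {
        long sink = 0;
        for (int i = 0; i < iterations; ++i) sink += task.getAsLong(); // warm-up
        long start = System.nanoTime();
        for (int i = 0; i < iterations; ++i) sink += task.getAsLong(); // measured
        long elapsed = System.nanoTime() - start;
        if (sink == Long.MIN_VALUE) System.out.println(sink); // keep result live
        return elapsed;
    }

    // Stand-in for one of the two iteration approaches under test.
    static long sumTask() {
        long sum = 0;
        for (int i = 0; i < 1_000; ++i) sum += i;
        return sum;
    }

    public static void main(String[] args) {
        long t = measure(SplitBenchmark::sumTask, 10_000);
        System.out.println("elapsed ns: " + t);
    }
}
```

Each approach would get its own method passed to `measure`, so neither measurement can benefit from JIT work triggered by the other running first.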

Related Q/As for benchmarking:

  • What I really am benchmarking is behind the scenes work. `getAllItems` creates a new `List` and populates it by getting all items from 16 sublists using `addAll`. The `Iterator` approach doesn't construct this temporary list but rather keeps track of in which of these 16 sublists it is currently iterating and has some logic to make the 16 sublists look like a continuous list. – Emily L. Aug 23 '14 at 20:29
  • @EmilyL. well, but that seems to be tested by using two approaches. Split them into two methods, this is, into two unit tests, each for an approach. After that, use a benchmark framework or consider/implement all the rules stated in the Q/A in the first paragraph of this post. – Luiggi Mendoza Aug 23 '14 at 20:30
  • I was hoping to not have to fiddle with my build system. Just wanted a quick yes/no answer on if the iterator approach is better. I'll see if I can modify my benchmark according to the tips in that link. – Emily L. Aug 23 '14 at 20:56
  • @EmilyL. sometimes there are yes/no answers for better performance alternatives/algorithms, sometimes there aren't. When not, you have to benchmark it or use a profiler to check the results. – Luiggi Mendoza Aug 23 '14 at 20:57
  • @LuiggiMendoza: ...and targeting JUnit helps proper benchmarking how? – Aleksey Shipilev Aug 24 '14 at 17:22
  • @AlekseyShipilev it is just an alternative. If you feel you can propose another, then provide an answer. – Luiggi Mendoza Aug 24 '14 at 17:23
  • @LuiggiMendoza: [at]leventov already proposed one. I'm just trying to understand how targeting JUnit helps benchmarking, since this seems to be your premise? – Aleksey Shipilev Aug 24 '14 at 17:44
  • @AlekseyShipilev OP already uses JUnit for benchmark, then just add a framework and you can turn your JUnit tests into benchmark tests. As easy as that, no need to do anything else. All those are options, you just have to feel free to choose which to use and feel more comfortable. – Luiggi Mendoza Aug 24 '14 at 17:48
  • @AlekseyShipilev plus, by the time I wanted to do my own micro benchmarks, there was no JMH available to play with, so I haven't tested this option yet. – Luiggi Mendoza Aug 24 '14 at 17:49
  • @LuiggiMendoza: But I thought the benchmarking is about avoiding the benchmarking pitfalls, and frameworks are different in these regards, which is not always about what is "more comfortable". – Aleksey Shipilev Aug 24 '14 at 17:53
  • @AlekseyShipilev among all non more comfortable options, I found one of these frameworks the best for me. We don't have the same tastes, so you're free to choose what you want/need. Otherwise, there would be a single hardware architecture that supports a single unique perfect programming language with a single debugger (if needed because the language itself would be so perfect that bugs fixes itself) and more and more... but that's utopia, similar to finding a unique tool that satisfies all the specific needs for everybody. – Luiggi Mendoza Aug 24 '14 at 17:55
  • @LuiggiMendoza: Truly. Want to play a game? You show me the JUnit-something based benchmark, and I show why it's incorrect. :) – Aleksey Shipilev Aug 24 '14 at 18:37
  • @AlekseyShipilev if you have something to say, just say it. We're here to learn and help each other, not to show who knows more than other people or similar silly games. – Luiggi Mendoza Aug 24 '14 at 19:09
  • @LuiggiMendoza: That is not a silly game, that's an educational exercise in benchmarking, because most people learn when they do things with their hands. If you want to help people, then go ahead and replicate these in JUnit-based benchmarks: http://hg.openjdk.java.net/code-tools/jmh/file/tip/jmh-samples/src/main/java/org/openjdk/jmh/samples/, then we can talk how helpful it is to recommend JUnit-based benchmarking approaches to people. – Aleksey Shipilev Aug 24 '14 at 19:16
  • @AlekseyShipilev if I have some spare time, I could do. And the silly game was for your sentence: *You show me the JUnit-something based benchmark, and I show why it's incorrect*. **It is not incorrect, is just another approach**. Understand it! Also, for my specific needs, I cannot simply write a main method and make it work for my environment, it is way more complex and the algorithms were already covered with JUnit, so why to have all that work when I can simply add one of these libraries and adapt it to my needs? It's a matter of choice, and for now I don't need JMH. Good if you do. – Luiggi Mendoza Aug 24 '14 at 19:20
  • Indeed, going to the construction site without a hard cap can also be regarded as "just another approach". And most of the time, nothing bad is going to happen, luring you into the false sense of security. Some might even argue it is not incorrect, is just another approach to the security. OSHA will disagree though :) – Aleksey Shipilev Aug 24 '14 at 19:24

After reading the links from Luiggi Mendoza, I refactored my test code like this:

@Test
public void testTime() throws Exception{

    LoadoutBase<?> loadout = new LoadoutStandard("AS7-D-DC");

    long start, end, sum;
    final int iterations = 10000;

    //WARMUP
    start = System.nanoTime();
    sum = 0;
    for(int i = 0; i < iterations; ++i){
        Iterator<Item> it = loadout.iterator();
        while( it.hasNext() ){
            sum += it.next().getNumCriticalSlots();
        }
    }
    end = System.nanoTime();
    //ENDOF WARMUP

    start = System.nanoTime();
    sum = 0;
    for(int i = 0; i < iterations; ++i){
        Iterator<Item> it = loadout.iterator();
        while( it.hasNext() ){
            sum += it.next().getNumCriticalSlots();
        }
    }
    end = System.nanoTime();
    long time_it = end - start;


    // WARMUP
    start = System.nanoTime();
    sum = 0;
    for(int i = 0; i < iterations; ++i){
        for(Item item : loadout.getAllItems()){
            sum += item.getNumCriticalSlots();
        }
    }
    end = System.nanoTime();
    // ENDOF WARMUP


    start = System.nanoTime();
    sum = 0;
    for(int i = 0; i < iterations; ++i){
        for(Item item : loadout.getAllItems()){
            sum += item.getNumCriticalSlots();
        }
    }
    end = System.nanoTime();
    long time_arrays = end - start;

    System.out.println("it: " + time_it + " array: " + time_arrays + " diff: " + (double)time_it/time_arrays);
}

Now I get consistent results saying that the iterator approach takes about 0.7 times the execution time of the getAllItems approach. The results are consistent even if I change the order of the tests, so I trust them for this test.

Thanks.
