3

There are probably hundreds of questions about Java Collections vs. arrays, but this is something I really didn't expect.

I am developing a server for my game, and to communicate between the client and server you need to send packets (obviously), so I did some tests which Collection (or array) I could use best to handle them, HashMap, ArrayList and a PacketHandler array. And the outcome is very unexpected to me, because the ArrayList wins.

The packet handling structure is just like dictionary usage (index to PacketHandler), and because an array is the most primitive form of dictionary use I thought that would easily perform better than an ArrayList. Could someone explain me why this is?

My test

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Random;

public class Main {
  /**
   * Packet handler interface.
   */
  private interface PacketHandler {
    void handle();
  }

  /**
   * A dummy packet handler.
   */
  private class DummyPacketHandler implements PacketHandler {
    @Override
    public void handle() { }
  }

  public Main() {
    Random r = new Random();
    PacketHandler[] handlers = new PacketHandler[256];
    HashMap<Integer, PacketHandler> m = new HashMap<Integer, PacketHandler>();
    ArrayList<PacketHandler> list = new ArrayList<PacketHandler>();

    // packet handler initialization
    for (int i = 0; i < 255; i++) {
      DummyPacketHandler p = new DummyPacketHandler();
      handlers[i] = p;
      m.put(new Integer(i), p);
      list.add(p);
    }

    // array
    long time = System.currentTimeMillis();
    for (int i = 0; i < 10000000; i++)
      handlers[r.nextInt(255)].handle();
    System.out.println((System.currentTimeMillis() - time));

    // hashmap
    time = System.currentTimeMillis();
    for (int i = 0; i < 10000000; i++)
      m.get(new Integer(r.nextInt(255))).handle();
    System.out.println((System.currentTimeMillis() - time));

    // arraylist
    time = System.currentTimeMillis();
    for (int i = 0; i < 10000000; i++)
      list.get(r.nextInt(255)).handle();
    System.out.println((System.currentTimeMillis() - time));
  }

  public static void main(String[] args) {
    new Main();
  }
}

I think the problem is quite solved, thanks everybody

Jon
  • 457
  • 1
  • 5
  • 11

2 Answers2

4

The shorter answer is that ArrayList is slightly more optimised the first time, but is still slower in the long run.

How and when the JVM optimise the code before its completely warmed up isn't always obvious and can change between version and based on your command line options.

What is really interesting is what you get when you repeat the test. The reason that makes a difference here is that the code is compiled in stages in the background as you want to have tests where the code is already as fast as it will get right from the start.


There are a few things you can do to make your benchmark more reproducaeable.

  • generate your random numbers in advance, they are not part of your test but can slow you down.
  • place each loop in a separate method. The first loop triggers the whole method to be compiled for better or worse.
  • repeat the test 5 to 10 times and ignore the first one.
  • Using System.nanoTime() instead of currentTimeMillis() It may not make any difference here but is a good habit to get into.
  • Use autoboxing where you can so it uses the integer cache or Integer.valueOf(n) which does the same thing. new Integer(n) will always create an object.
  • make sure your inner loop does something otherwise its quite likely the JIT will optimise it away to nothing. ;)

.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Random;

public class Main {

    /**
     * Packet handler interface.
     */
    private interface PacketHandler {
        void handle();
    }

    /**
     * A dummy packet handler.
     */
    static class DummyPacketHandler implements PacketHandler {

        @Override
        public void handle() {
        }

    }

    public static void main(String[] args) {
        Random r = new Random();

        PacketHandler[] handlers = new PacketHandler[256];
        HashMap<Integer, PacketHandler> m = new HashMap<Integer, PacketHandler>();
        ArrayList<PacketHandler> list = new ArrayList<PacketHandler>();

        // packet handler initialization
        for (int i = 0; i < 256; i++) {
            DummyPacketHandler p = new DummyPacketHandler();
            handlers[i] = p;
            m.put(new Integer(i), p);
            list.add(p);
        }

        int runs = 10000000;
        int[] handlerToUse = new int[runs];
        for (int i = 0; i < runs; i++)
            handlerToUse[i] = r.nextInt(256);

        for (int i = 0; i < 5; i++) {
            testArray(handlers, runs, handlerToUse);
            testHashMap(m, runs, handlerToUse);
            testArrayList(list, runs, handlerToUse);
            System.out.println();
        }
    }

    private static void testArray(PacketHandler[] handlers, int runs, int[] handlerToUse) {
        // array
        long time = System.nanoTime();
        for (int i = 0; i < runs; i++)
            handlers[handlerToUse[i]].handle();
        System.out.print((System.nanoTime() - time)/1e6+" ");
    }

    private static void testHashMap(HashMap<Integer, PacketHandler> m, int runs, int[] handlerToUse) {
        // hashmap
        long time = System.nanoTime();
        for (int i = 0; i < runs; i++)
            m.get(handlerToUse[i]).handle();
        System.out.print((System.nanoTime() - time)/1e6+" ");
    }

    private static void testArrayList(ArrayList<PacketHandler> list, int runs, int[] handlerToUse) {
        // arraylist
        long time = System.nanoTime();
        for (int i = 0; i < runs; i++)
            list.get(handlerToUse[i]).handle();
        System.out.print((System.nanoTime() - time)/1e6+" ");
    }
}

prints for array HashMap ArrayList

24.62537 263.185092 24.19565 
28.997305 206.956117 23.437585 
19.422327 224.894738 21.191718 
14.154433 194.014725 16.927638 
13.897081 163.383876 16.678818 

After the code warms up, the array is marginally faster.

Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130
  • - What do you mean with 'in advance', like storing them in an array? And I don't really think nanoTime() will be better, since it iterates one million times through each one, but thanks for the tip – Jon Sep 13 '12 at 17:59
  • Not much better here but as you can see from my example, its getting down to a few milli-seconds. – Peter Lawrey Sep 13 '12 at 18:04
  • @Jon, nanoTime is better because the high precision clock is much more accurate (not just more precise). That is, the curreTimeMillis method can give you a value which has drifted around. – Tim Bender Sep 13 '12 at 18:05
  • @TimBender OK, if you want me to be blunt... "Could someone explain me why this is? " because the test is broken, in these ways and if you fix the test you get the expected result. – Peter Lawrey Sep 13 '12 at 18:05
  • 1
    The shorter answer is that ArrayList is slightly more optimised the first time, but is still slower in the long run. – Peter Lawrey Sep 13 '12 at 18:06
  • 1
    @PeterLawrey, Sorry, my comment was before you updated the test with fixes and ran it. I think saying something like your last comment is a really good answer. – Tim Bender Sep 13 '12 at 18:11
  • @TimBender I was trying to be diplomatic but not made the answer clear. – Peter Lawrey Sep 13 '12 at 18:17
  • @Peter Lawrey, Tim Bender, didn't know that, I only thought it was for precise but didn't think it would harm to benchmark, - and also I didn't know the startup took so long, because I thought the JVM initialized everything in the code before the program actually executes – Jon Sep 13 '12 at 18:19
  • Its can take at least 10,000 iteration and hundreds of milli-seconds to warmup even a simple loop. :P – Peter Lawrey Sep 13 '12 at 18:21
2

There are at least a few problems with your benchmark:

  • you run your tests directly in main, meaning that when you main method gets compiled, the JIT compiler has not had time to optimise all the code because it has not run it yet
  • the map method creates a new integer each time, which is not fair: use m.get(r.nextInt(255)).handle(); to allow the Integer cache to be used
  • you need to run your test several times before you can draw conclusions
  • you are not using the result of what you do in your loops and the JIT is therefore allowed to simply ignore them
  • monitor GC as it might always run at the same time and bias the results of one of your loop and add a System.gc() call between each loop.

But before doing all that, read this post ;-)

After tweaking your code a bit, I get these results:

Array: 116
Map: 139
List: 117

So array and list are close to identical once compiled and map is slightly slower.

Code:

public class Main {

/**
 * Packet handler interface.
 */
private interface PacketHandler {

    int handle();
}

/**
 * A dummy packet handler.
 */
private class DummyPacketHandler implements PacketHandler {

    @Override
    public int handle() {
        return 123;
    }
}

public Main() {
    Random r = new Random();

    PacketHandler[] handlers = new PacketHandler[256];
    HashMap<Integer, PacketHandler> m = new HashMap<Integer, PacketHandler>();
    ArrayList<PacketHandler> list = new ArrayList<PacketHandler>();

    // packet handler initialization
    for (int i = 0; i < 255; i++) {
        DummyPacketHandler p = new DummyPacketHandler();
        handlers[i] = p;
        m.put(new Integer(i), p);
        list.add(p);
    }

    long sum = 0;
    runArray(handlers, r, 20000);
    runMap(m, r, 20000);
    runList(list, r, 20000);

    // array
    long time = System.nanoTime();
    sum += runArray(handlers, r, 10000000);
    System.out.println("Array: " + (System.nanoTime() - time) / 1000000);

    // hashmap
    time = System.nanoTime();
    sum += runMap(m, r, 10000000);
    System.out.println("Map: " + (System.nanoTime() - time) / 1000000);

    // arraylist
    time = System.nanoTime();
    sum += runList(list, r, 10000000);
    System.out.println("List: " + (System.nanoTime() - time) / 1000000);

    System.out.println(sum);
}

public static void main(String[] args) {
    new Main();
}


private long runArray(PacketHandler[] handlers, Random r, int loops) {
    long sum = 0;
    for (int i = 0; i < loops; i++)
        sum += handlers[r.nextInt(255)].handle();
    return sum;
}

private long runMap(HashMap<Integer, PacketHandler> m, Random r, int loops) {
    long sum = 0;
    for (int i = 0; i < loops; i++)
        sum += m.get(new Integer(r.nextInt(255))).handle();
    return sum;
}

private long runList(List<PacketHandler> list, Random r, int loops) {
    long sum = 0;
    for (int i = 0; i < loops; i++)
        sum += list.get(r.nextInt(255)).handle();
    return sum;
}

}

Community
  • 1
  • 1
assylias
  • 321,522
  • 82
  • 660
  • 783
  • Yes, I understand that the random integer generating is not 'fair', but I thought I would not really matter since I generate them at every Collection/Array – Jon Sep 13 '12 at 17:57
  • @Jon I'm talking about using `new Integer(random)` only for the map => that creates a new object every time. – assylias Sep 13 '12 at 17:58
  • @Jon corrected my post accordingly - I now understand why it was not clear. – assylias Sep 13 '12 at 18:01
  • Ahh, I see. Yes, you're right, but that was actually part of the test; I want to see which one is faster, and to do this with HashMap I have to create that Integer object. Or do you recommend something different? – Jon Sep 13 '12 at 18:13
  • @Jon no you don't - just pass in an int and the autoboxing will do the rest. – assylias Sep 13 '12 at 19:36
  • Didn't know that, I didn't expect that you could use primitives where a wrapper class is expected. – Jon Sep 13 '12 at 21:02