HashMap or TreeMap or LinkedHashMap which one is fastest to iterate over?

Question

I have a Map which is filled up during the start up of application. It doesn't change later during the execution of application. Later this map is only used to iterate all the elements in it. Which concrete implementation of Map should I choose? HashMap or TreeMap or LinkedHashMap ?
UPDATE
Insertion order doesn't matter. The only thing that matters is fast iteration of all elements (say 6000 elements).

Insertion order doesn't matter. The only thing that matters is fast iteration of all elements (say 6000 elements). — Mac, Jul 28 '13 at 16:48
Consider editing your post instead of adding a comment to it, that's easy to miss. — Marconius, Jul 28 '13 at 17:00
Please post your [profile](http://stackoverflow.com/q/2064427/230513) results. — trashgod, Jul 28 '13 at 17:02

score 13 · Answer 1 · edited Oct 05 '17 at 18:35

13

HashMap will generally be fastest, since it has the best cache behavior (HashMap iterates directly over the backing array, whereas TreeMap and LinkedHashMap iterate over linked data structures).

You may want to use an ImmutableMap or UnmodifiableMap if the map isn't going to change once it's initialized

edited Oct 05 '17 at 18:35

patstuart

1,931
1
19
29

answered Jul 28 '13 at 16:49

Zim-Zam O'Pootertoot

17,888
4
41
69

5

The story is not that simple. You should add the following quote from the HashMap Javadoc: "Iteration over collection views requires time proportional to the "capacity" of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important." – Marko Topolnik Jul 28 '13 at 17:35
@MarkoTopolnik This Last line that you have posted from javadoc is what making me to give it a thought to reconsider my decision to use `HashMap` blindly.!! What is your suggestion on this scenario? – Mac Jul 28 '13 at 17:39

Marko Topolnik · Accepted Answer · 2023-04-27T12:42:07.473

10

None of the other answers here take into consideration the effects of the CPU cache, which can be huge when iteration is concerned.

One way to improve this is to use just a single array of interleaved keys and values (keys at even indices, values at odd ones). This will tightly group together these data items and maximally leverage the cache, at least for the references.

But the true, screaming improvement would be achieved if you could avoid creating objects which hold your data and use just arrays of primitive values. This is, naturally, highly dependent on your use case.

edited Apr 27 '23 at 12:42

answered Jul 28 '13 at 17:43

Marko Topolnik

195,646
29
319
436

In my case I am storing an `ID` which is integer as key and an `object` of some user defined data type as the value. So, I don't think that I can use a single array for this purpose as the data type for `key` and `value` is different. – Mac Jul 28 '13 at 17:46
2

If you don't know the value type, then there's little you can do but optimize the access to the key and value references. It *may* pay off to have a separate primitive `int[]` for the keys and another one for the values. If you really want the last inch of performance, you'll have to benchmark these alternatives. – Marko Topolnik Jul 28 '13 at 17:54
Thanks for the different and innovative alternative.I would surely love to give it a try. But some doubt of mine is still not cleared about the Iteration power of these three `Map`s :( . I know the `value` type as I have myself designed that class whose object is going to become the `value` of map. Which of the three `Maps` will have fastest iteration in my case is still not clearly hardwired in my mind :( – Mac Jul 28 '13 at 18:00
Nobody can give the ultimate answer: you must benchmark it. And when you do, use a microbenchmark tool like Google Caliper or Oracle `jmh`. Otherwise you will almost certainly obtain inappropriate results. Yes, performance is a tough subject and requires a great amount of knowledge, experience, and intuition. There's just no royal path to it. – Marko Topolnik Jul 28 '13 at 18:07
Thanks. I am for now considering your approach and would go through those microbenchmark (I don't know how to use it . So any citation from you or any link would be much helpful). – Mac Jul 28 '13 at 18:22
It's easy: just google for Google Caliper. [This will be the first link](https://code.google.com/p/caliper). Similar for [OpenJDK jmh](http://openjdk.java.net/projects/code-tools/jmh). Both are Maven dependencies which you just add to your project and follow a few simple steps. Caliper is slightly easier to use – Marko Topolnik Jul 28 '13 at 18:28
And regarding the CPU cache by `Iterator`. By this do you want to convey that Iterator internally caches the `key-value` pairs in CPU registers before iterating it? Or, I got it wrong? – Mac Jul 28 '13 at 18:31
1

I wouldn't bother with an Iterator at all, there's just no need for that. The point is that as you iterate, you access successive RAM addresses instead of sprinkling your requests all over. All CPU cache architechures are optimized with the assumption that the next access will be just ahead of the current access. – Marko Topolnik Jul 28 '13 at 18:33
can you please put some light on this question http://stackoverflow.com/q/24591338/2536255 – Mac Jul 05 '14 at 23:39

OldCurmudgeon · Answer 3 · 2013-07-28T18:16:55.747

8

I wouldn't use the map. If all you want is to iterate through the entries make a new ArrayList of what you need and use that - you cannot get faster than an ArrayList for iteration.

// Which map you use only chooses the order of the list.
Map<Key,Value> map = new HashMap<>();
// The list to iterate for maximum speed.
List<Map.Entry<Key,Value>> list = new ArrayList<>(map.entrySet());

This way you iterate across the entry set just once to build the list. From then on you are iterating across the list again and again - which certainly should be close to optimal.

Note Changed from LinkedList to ArrayList on Marko's suggestion.

edited Jul 28 '13 at 18:16

answered Jul 28 '13 at 18:05

OldCurmudgeon

64,482
16
119
213

1

Array iteration is certainly much faster due to a very, very much denser packing of the elements. – Marko Topolnik Jul 28 '13 at 18:09
Good point Marko - so if an observed speedup can be optained from an `ArrayList` then use that but you will be taking more memory. A `LinkedList` will add minimal memory overhead. – OldCurmudgeon Jul 28 '13 at 18:11
1

No, a `LinkedList` is actually the most wasteful: it allocates a full object for each list node. An `ArrayList` just has an array and, naturally, you can tailor its size just right if you are converting from an existing map. But my suggestion would be a plain array instead of an `ArrayList`. For a fixed list you really don't gain anything from `ArrayList`. – Marko Topolnik Jul 28 '13 at 18:13
1

@MarkoTopolnik - You are right - I realized that just after I posted. I'll change my post. Sometimes I still think in C. – OldCurmudgeon Jul 28 '13 at 18:15

score 3 · Answer 4 · answered Jul 28 '13 at 16:53

3

If your map is used only to iterate over elements and the speed of iteration is important, then convert the map into an ArrayList and iterate over it, this will be the fastest way.

answered Jul 28 '13 at 16:53

Evgeniy Dorofeev

133,369
30
199
275

How can I convert `HashMap` into `ArrayList`? How an `Arraylist` is going to store the `key-value` pairs of an `Hashmap`? – Mac Jul 28 '13 at 16:58
@Mac traverse the `entrySet()` of the `Map` and copy the keys into a array of keys and the values into an array of values, as suggested in my answer. – Óscar López Jul 28 '13 at 16:59
@Mac - easy. create a class to hold both your key and value, then create an ArrayList of that class ... – radai Jul 28 '13 at 17:00
@radai that class already exists, it's the `Map.Entry` used internally in a `Map` – Óscar López Jul 28 '13 at 17:10
1

@ÓscarLópez - using Map.Entry in any context that isnt a Map (like if he switches from map to list) is just plain ugly. – radai Jul 28 '13 at 17:23
@radai But that would lead to the creation of 6000 extra objects in memory!! Which I think is not appreciative.. – Mac Jul 28 '13 at 17:40
@Mac - not if you get rid of the map it wont. – radai Jul 28 '13 at 17:57

score 1 · Answer 5 · edited May 23 '17 at 12:02

1

I agree with Zim-Zam's answer, and add something: if you need to access both the keys and the values during the iteration, the fastest way is using the entrySet() method, as discussed here.

Now, if you're sure that the map is never going to change, why not use a separate data structure for iteration? for example: iterate once over the finished map and fill with its contents a couple of arrays, one with the keys and the other with the values, in the same corresponding positions. Then traversing those arrays will be as fast as it can get.

edited May 23 '17 at 12:02

Community

1
1

answered Jul 28 '13 at 16:51

Óscar López

232,561
37
312
386

Does it mean that `Iterator` obtained for set obtained by `entrySet()` method of `Hashmap` will be faster than that of `TreeMap`? I know that `insertion` and `searching` in `Hashmap` is faster than `Treemap` but I want to know the performance of `Iterator` for both of them – Mac Jul 28 '13 at 17:23
@Mac No, it means that traversing a map will be faster if using the set returned by `entrySet()`. A different matter is the fact that traversing a `HashMap` might be faster than traversing a `TreeMap` – Óscar López Jul 28 '13 at 17:25
The `Iterator` obtained from the `entrySet()` of a `HashMap` might be faster, because the way it's implemented internally. – Óscar López Jul 28 '13 at 17:28
You have used *might be*. But , Can I be assured that `HashMap` performance will be faster ? – Mac Jul 28 '13 at 17:30
1

You should run a profiler and see for yourself, it's the only way to be sure. I'm estimating the relative speeds based on my knowledge of the internals of the classes, but many factors come into play - don't take my word for it, perform some measurements. – Óscar López Jul 28 '13 at 17:31

score 1 · Answer 6 · answered Mar 26 '19 at 18:54

As of Java 10 you can use one of the Map.of() overloaded methods or Map.copyOf() to create an unmodifiable Map. As the map returned by these methods doesn't support putting any new entries into the map, the existing keys/values are stored in an interleaving pattern as described in this answer. This should offer the best performance for your use case.

hamom1968 · Answer 7 · 2017-05-02T23:31:14.830

The Map.forEach(BiConsumer) is almost always faster than Iterators, sometimes by a large margin. If you were summing the values for example:

public class TestForStackOverflow {
    int sum = 0;

    // Or whatever Map you are using
    Map<Object, Integer> map = new HashMap();

    static class C implements BiConsumer<Object, Integer> {
        int sum = 0;

        @Override
        public void accept(Object k, Integer v) {
            sum += v;
        }
    }

    public int getSum(Map map) {
        C c = new C();
        map.forEach(c);
        return c.sum;
    }

    public int getSum2(Map map) {
        map.forEach((k, v) -> sum += v);
        return sum;
    }
}

HashMap or TreeMap or LinkedHashMap which one is fastest to iterate over?

7 Answers7