37

Please refer to the following method :

public Set<LIMSGridCell> getCellsInColumn(String columnIndex){
    Map<String,LIMSGridCell> cellsMap = getCellsMap();
    Set<LIMSGridCell> cells = new HashSet<LIMSGridCell>();
    Set<String> keySet = cellsMap.keySet();
    for(String key: keySet){
      if(key.startsWith(columnIndex)){
        cells.add(cellsMap.get(key));
      }
    }
    return cells;
  }

FindBugs give this waring message :

"Inefficient use of keySet iterator instead of entrySet iterator This method accesses the value of a Map entry, using a key that was retrieved from a keySet iterator. It is more efficient to use an iterator on the entrySet of the map, to avoid the Map.get(key) lookup."

TryTryAgain
  • 7,632
  • 11
  • 46
  • 82
Geek
  • 26,489
  • 43
  • 149
  • 227
  • 1
    If the `Map` is a hash map it is debatable whether it is measurably more efficient, as the lookup is *O(1),* and otherwise it must be a `TreeMap` where the lookup is *(O log N).* It's hardly going to make much of a difference. Pure nitpicking here. – user207421 Jun 08 '21 at 04:24

5 Answers5

59

You are retrieving all the keys (accessing the whole map) and then for some keys, you access the map again to get the value.

You can iterate over the map to get map entries (Map.Entry) (couples of keys and values) and access the map only once.

Map.entrySet() delivers a set of Map.Entrys each one with the key and corresponding value.

for ( Map.Entry< String, LIMSGridCell > entry : cellsMap.entrySet() ) {
    if ( entry.getKey().startsWith( columnIndex ) ) {
        cells.add( entry.getValue() );
    }
}

Note: I doubt that this will be much of an improvement since if you use map entries you will instantiate an object for each entry. I don't know if this is really faster than calling get() and retrieving the needed reference directly.

Community
  • 1
  • 1
Matteo
  • 14,696
  • 9
  • 68
  • 106
  • 4
    but isn't get() on hashMap O(1) ? – Geek Sep 28 '12 at 11:52
  • 5
    @Geek: yes. See my added note. I doubt that FindBugs suggestion really makes sense. Both instantiation and get() are O(1) – Matteo Sep 28 '12 at 11:54
  • 4
    The map may store Entry's (e.g. Sun's HashMap implementation) so no instantiation is required. And get() may be more than O(1), e.g. a TreeMap or a HashMap with a bad hash function. But you're right that in most cases it won't make a noticeable difference. – TimK Sep 28 '12 at 17:43
  • @Matteo Could you please review my answer? Please let me know if any comments. – Kanagavelu Sugumar Jun 26 '14 at 08:46
  • 1
    “*if you use map entries you will instantiate an object for each entry*”—Surely not. Most map implementations are already a map of entries. Most notably when iterating over a `HashMap`, the entry instances are identical to the internally stored entry objects. So calling `getValue` (and likewise `setValue`) on an `Entry` is accessing the value directly, whereas calling `get` on the map implies calling `hashCode` on the key, calculating an array index, and calling `equals` at least once on the key, to get to the same entry object you’d have already in the first place when using `entrySet()`. – Holger Jun 08 '21 at 14:57
  • @Matteo can you please look into my question https://stackoverflow.com/questions/67880418/getdetailsstring-makes-inefficient-use-of-keyset-iterator-instead-of-entryset – Gen Jun 15 '21 at 04:05
13

If somebody is still interested in a detailed and number-backed answer: yes, you should use entrySet() vs. keySet() in case you are iterating over the whole Map. See this Gist for detailed numbers. I run a benchmark with JMH for the default implementations of Map with Oracle JDK8.

The main finding is: it is always a bit slower to iterate over the keySet and re-query for every key. As soon as you have bigger maps, the multiplier can become quite big (e.g., for a ConcurrentSkipListMap it is always 5-10x; while for HashMaps it is not bigger than 2x for up to a million entries).

However, these are still very small numbers. The slowest way to iterate over 1 million entries is with a ConcurrentSkipListMap.keySet(), which is around 500-700 milliseconds; while iterating over over IdentityHashMap.entrySet() is just 25-30 milliseconds with LinkedHashMap.entrySet() just behind with 40-50 milliseconds (not surprising, as it has a LinkedList inside it, which helps with iteration). As an overview from the above linked Gist:

Map type              | Access Type | Δ for 1M entries
----------------------+-------------+-----------------
HashMap               | .entrySet() |     69-72  ms
HashMap               |   .keySet() |     86-94  ms
ConcurrentHashMap     | .entrySet() |     72-76  ms
ConcurrentHashMap     |   .keySet() |     87-95  ms
TreeMap               | .entrySet() |    101-105 ms
TreeMap               |   .keySet() |    257-279 ms
LinkedHashMap         | .entrySet() |     37-49  ms
LinkedHashMap         |   .keySet() |     89-120 ms
ConcurrentSkipListMap | .entrySet() |     94-108 ms
ConcurrentSkipListMap |   .keySet() |    494-696 ms
IdentityHashMap       | .entrySet() |     26-29  ms
IdentityHashMap       |   .keySet() |     69-77  ms

So the bottom line is: it depends on your use-case. While it is definitely faster to iterate over the entrySet() the numbers are not huge, especially for reasonably small Maps. However, if you are iterating over a Map with 1 million entries quite regularly, better use the faster way ;)

The numbers of course are just to compare with each other, not absolutes.

D. Kovács
  • 1,232
  • 13
  • 25
10

You're getting the set of keys in the map, then using each key to get the value out of the map.

Instead, you can simply iterate through the Map.Entry key/value pairs returned to you via entrySet(). That way you avoid the relatively expensive get() lookup (note the use of the word relatively here)

e.g.

for (Map.Entry<String,LIMSGridCell> e : map.entrySet()) {
   // do something with...
   e.getKey();
   e.getValue();
}
Brian Agnew
  • 268,207
  • 37
  • 334
  • 440
  • In this case the map implementation is HashMap . Isn't get() for a HashMap O(1) ? – Geek Sep 28 '12 at 11:46
  • @Geek: it is, but with using entrySet() you completely remove the call to `get()` –  Sep 28 '12 at 11:58
  • 3
    O(1) doesn't specify how long it will take, merely that it's constant – Brian Agnew Sep 28 '12 at 11:59
  • 5
    But he doesn't access _each_ value by get(). Only those, whose keys match the condition are used. I think there's no general rule which way to prefer. It depends on the fraction of keys matching the condition. Obviously, FindBugs can't check that. – Heiko Schmitz Sep 28 '12 at 12:59
2

This is the suggestion; not really an answer for your question. When you are working with ConcurrentHashMap; below is the iterator behavior mentioned in javadoc

The view's iterator is a "weakly consistent" iterator that will never throw ConcurrentModificationException, and guarantees to traverse elements as they existed upon construction of the iterator, and may (but is not guaranteed to) reflect any modifications subsequent to construction.

So if you use EntrySet iterator; this may contain stale key/value pair; so it would be better; get the key from keySet iterator(); and check with collection for value. this will ensure you are getting the recent change from the collection.

If you are OK with fail-safe iterator; then check this link ; it states using entrySet; little improving the performance.

Kanagavelu Sugumar
  • 18,766
  • 20
  • 94
  • 101
0

In keyset, you need to get all the keys and then you search for every key in the collection.

Moreover, loop over the entrySet is faster, because you don't query the map twice for each key.

If you need only keys or only values of your Map, then use rather keySet() or values().

Chetan Laddha
  • 993
  • 8
  • 22