
I'm a novice in algorithms. I have read that the big-O of put(K key, V value) in HashMap is O(1). But when I went into the core of the HashMap class:

    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            // ... table not allocated yet: create it via resize()
        if ((p = tab[i = (n - 1) & hash]) == null)
            // ... bucket is empty: place the new node directly into it
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                // ... bucket has been converted to a red-black tree: insert there
            else {
                // walk the linked list of this one bucket
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        // ... treeify the bucket if it got too long, then break
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                // ... update the value and return the old one
            }
        }
        // ... bump modCount/size and resize the table if the threshold was crossed
        return null;
    }

As you can see, when adding a new item to the HashMap, it will iterate over at most n items (all the items in the HashMap) with the for loop above:

    for (int binCount = 0; ; ++binCount) {

Now, the big-O of this for loop is O(n), so why can the big-O of put(K key, V value) in HashMap be O(1)? Where is my understanding wrong?

Thanks very much.

3 Answers


A HashMap is actually an array of buckets, and each bucket is a linked list that gets converted into a red-black tree once it grows past a threshold (as of Java 8). If you have a very poor hash function that puts all the elements into the same bin, performance degrades to O(log(n)).

From Baeldung,

HashMap has O(1) complexity, or constant-time complexity, of putting and getting the elements. Of course, lots of collisions could degrade the performance to O(log(n)) time complexity in the worst case, when all elements land in a single bucket. This is usually solved by providing a good hash function with a uniform distribution.

From the docs,

This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets.
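
To make the worst case concrete, here is a minimal sketch (the BadKey class and its constant hash value are made up purely for illustration) of a key whose hashCode() always returns the same number, so every entry lands in the same bucket and, as of Java 8, that bucket is treeified:

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical key type: every instance reports the same hash code,
    // which forces all entries of the map into a single bucket.
    class BadKey implements Comparable<BadKey> { // Comparable lets the tree bin order the colliding keys
        private final int id;

        BadKey(int id) { this.id = id; }

        @Override
        public int hashCode() { return 42; } // constant hash: every key collides

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        @Override
        public int compareTo(BadKey other) { return Integer.compare(id, other.id); }
    }

    public class WorstCaseDemo {
        public static void main(String[] args) {
            Map<BadKey, String> map = new HashMap<>();
            // All entries collide into one bucket; since Java 8 that bucket
            // becomes a red-black tree, so put/get degrade to O(log n).
            for (int i = 0; i < 10_000; i++) {
                map.put(new BadKey(i), "value-" + i);
            }
            System.out.println(map.get(new BadKey(1234))); // prints value-1234
        }
    }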

Bill

The idea is that a single bin of the hash table has an expected constant number of elements. Therefore the loop you mentioned would run in O(1) expected time (assuming the hashCode() of your keys is not terrible).

Specifically, for the HashMap implementation, there's a load factor whose value is 0.75 by default. This means that, on average, each bin of the HashMap should hold at most 0.75 elements (once there are more than load-factor * number-of-bins entries in the HashMap, the number of bins is doubled in order to maintain this invariant). Therefore, the mentioned loop should run for a single iteration on average.
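
As a rough sketch of what that resize invariant looks like in practice (the concrete numbers assume the default initial capacity of 16 and the default 0.75 load factor):

    import java.util.HashMap;
    import java.util.Map;

    public class LoadFactorDemo {
        public static void main(String[] args) {
            // Starts with 16 buckets and a 0.75 load factor, i.e. a resize
            // threshold of 16 * 0.75 = 12 entries.
            Map<String, Integer> map = new HashMap<>();

            for (int i = 1; i <= 100; i++) {
                map.put("key-" + i, i);
            }
            // The table doubles every time the threshold is crossed
            // (16 -> 32 -> 64 -> 128 -> 256 buckets for these 100 entries),
            // so with a reasonable hash function the average bucket holds
            // well under one entry and the loop from the question usually
            // stops after its first iteration.
        }
    }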

Eran

Big-O notation has to do with the performance of a function relative to the number of elements it operates on. Just having a for loop doesn't suddenly make a HashMap lookup's performance grow by a fixed amount for every element in the HashMap.

There are patterns in big-O notation. A loop across the entire set of elements is O(n), but a loop in general doesn't mean the lookup is O(n). To illustrate, I'll use the (silly) examples below.

Function with O(1) performance

  public int add_o_one(int x, int y) {
     return x + y;
  }

Function with O(1) performance, with loops added

  public int add_still_o_one(int x, int y) {
     int[] numbers = new int[2];
     numbers[0] = x;
     numbers[1] = y;
     int result = 0;
     for (int index = 0; index < numbers.length; index++) {
        result += numbers[index];
     }
     return result;
  }

While I would expect the latter one to be a bit less efficient, there's no way to alter its run-time speed by choosing different (or more) numbers. Therefore, it is still O(1).

In your HashMap, the loop over a bucket's list does alter the speed relative to the input; but it doesn't alter it by a constant amount for every element in the map. Also, most HashMap implementations grow their number of buckets as the map fills. This means that the long-bucket scan you are seeing is not the common case and is seldom executed (unless you implement a really bad hashCode).

While you might consider that "something larger than O(1)", it is very difficult to make the code behave in a way that is inconsistent with O(1) unless you deliberately break the algorithm (for example, by providing objects that all hash to the same value).
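
For reference, here is a rough sketch of how the bucket index is chosen (the class and method names are made up; the bit-mixing and masking mirror the (n - 1) & hash line from the question's code). Unless your keys all share one hash code, well-distributed hash codes land in different buckets, which is what keeps put close to O(1):

    public class BucketIndexDemo {
        // The spreading step HashMap applies to a key's hashCode():
        // XOR the high 16 bits into the low 16 bits.
        static int spread(int h) {
            return h ^ (h >>> 16);
        }

        public static void main(String[] args) {
            int capacity = 16; // number of buckets (always a power of two)
            String[] keys = {"apple", "banana", "cherry", "date"};
            for (String key : keys) {
                // Mask the spread hash with (capacity - 1), just like the
                // (n - 1) & hash expression in putVal.
                int index = (capacity - 1) & spread(key.hashCode());
                System.out.println(key + " -> bucket " + index);
            }
        }
    }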

Edwin Buck