How does HashSet not allow duplicates?

Question

I was going through the add method of HashSet. It is mentioned that

If this set already contains the element, the call leaves the set unchanged and returns false.

But the add method is internally saving the values in HashMap

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

The put method of HashMap states that

Associates the specified value with the specified key in this map. If the map previously contained a mapping for the key, the old value is replaced.

So if the put method of HashMap replaces the old value, how the HashSet add method leaves the set unchanged in case of duplicate elements?

Kudos for looking up the source code for `HashSet.add`. Did you look up the source for `HashMap.put`? — Ray Toal, Mar 06 '14 at 09:53

score 33 · Accepted Answer · answered Mar 06 '14 at 10:04

PRESENT is just a dummy value -- the set doesn't really care what it is. What the set does care about is the map's keys. So the logic goes like this:

Set.add(a):
  map.put(a, PRESENT) // so far, this is just what you said
    the key "a" is in the map, so...
      keep the "a" key, but map its value to the PRESENT we just passed in
      also, return the old value (which we'll call OLD)
  look at the return value: it's OLD, != null. So return false.

Now, the fact that OLD == PRESENT doesn't matter -- and note that Map.put doesn't change the key, just the value mapped to that key. Since the map's keys are what the Set really cares about, the Set is unchanged.

In fact, there has been some change to the underlying structures of the Set -- it replaced a mapping of (a, OLD) with (a, PRESENT). But that's not observable from outside the Set's implementation. (And as it happens, that change isn't even a real change, since OLD == PRESENT).

score 8 · Answer 2 · edited Apr 21 '20 at 13:53

The answer that you may be looking comes down to the fact that the backing hashmap maps the elements of the set to the value PRESENT which is defined in HashSet.java as follows:

private static final Object PRESENT = new Object();

In the source code for HashMap.put we have:

  386     public V put(K key, V value) {
  387         if (key == null)
  388             return putForNullKey(value);
  389         int hash = hash(key.hashCode());
  390         int i = indexFor(hash, table.length);
  391         for (Entry<K,V> e = table[i]; e != null; e = e.next) {
  392             Object k;
  393             if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
  394                 V oldValue = e.value;
  395                 e.value = value;
  396                 e.recordAccess(this);
  397                 return oldValue;
  398             }
  399         }
  400 
  401         modCount++;
  402         addEntry(hash, key, value, i);
  403         return null;
  404     }

Because the key in question already exists, we will take the early return on line 397. But you might think a change is being made to the map on line 395, in which it appears that we are changing the value of a map entry. However, the value of value is PRESENT. But because PRESENT is static and final, so there is only one such instance; and so the assignment e.value = value actually doesn't change the map, and therefore the set, at all!

Update:

Once a HashSet is initialized.
- All the items in it are stored as keys in a HashMap
- All the values that HashMap have ONLY ONE object that is PRESENT which is a static field in HashSet

@RayToal, How you have copied the code along with line numbers?? — Zeeshan, Mar 07 '14 at 09:19
Copied from [here](http://hg.openjdk.java.net/jdk6/jdk6-gate/jdk/file/tip/src/share/classes/java/util/HashMap.java) — Ray Toal, Mar 07 '14 at 16:49

score 5 · Answer 3 · answered Mar 06 '14 at 09:54

5

As you can see the HashSet.add method adds the element to the HashMap.put as a key not as a value. Value is replaced in the HashMap not the key.

answered Mar 06 '14 at 09:54

shazin

21,379
3
54
71

3

Yes. All the values in the map are the same (PRESENT). The HashSet is using the inner map and its KEY portion to store the elements of the Set – EdH Mar 06 '14 at 09:56

score 3 · Answer 4 · answered Mar 06 '14 at 09:56

3

See HashMap#put:

Associates the specified value with the specified key in this map. If the map previously contained a mapping for the key, the old value is replaced.

It replaces the key with the new value, this way, no duplicates will be in the HashSet.

answered Mar 06 '14 at 09:56

Maroun

94,125
30
188
241

2

I got this that no duplicated will be in the `HashSet`, I wan't to know why the old duplicate value is not getting overwritten in the `HashSet`? – Zeeshan Mar 06 '14 at 09:58
@Shaan, The map value may be replaced, (which is a constant static field in HashSet class and thus replaced with the same instance), and the map key is left untouched (which, in fact IS the Set collection item) – sutanu dalui Mar 06 '14 at 10:26

score 2 · Answer 5 · answered Mar 06 '14 at 09:56

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

e is the key, So if e is already present put will not return null. Hence add will return false.

JavaDoc for put :

the previous value associated with key, or null if there was no mapping for key. (A null return can also indicate that the map previously associated null with key.)

score 1 · Answer 6 · answered Mar 06 '14 at 10:24

From javadocs for HashMap.put(), "Associates the specified value with the specified key in this map. If the map previously contained a mapping for the key, the old value is replaced."

Thus the map value will be replaced, (which is a constant static field in HashSet class, and thus the same instance is replaced), and the map key is kept untouched (which, in fact IS the Set collection item)

How does HashSet not allow duplicates?

6 Answers6

Linked