13

The following code produces the out put [1,2]even though hashset is not sorted.

Set set = new HashSet();
set.add(new Integer(2));
set.add(new Integer(1));
System.out.println(set);

Why is that?

arshajii
  • 127,459
  • 24
  • 238
  • 287
Ashwin
  • 12,691
  • 31
  • 118
  • 190

5 Answers5

17

EDIT: As of Java 8 and later, the following is no longer applicable. This proves that you shouldn't rely on undocumented Java behaviours.


This behaviour is caused by several separate reasons:

  • Integers hash to themselves
  • in Java, HashMaps and HashSets are backed up by an array
  • they also modify hashes using the higher bits to modify the lower bits; if the hash is in range 0..15, it's therefore not modified
  • what bucket an object goes depends on the lower bits of the modified hash
  • when iterating over a map or a set, the inner table is scanned sequentially

So if you add several small (<16) integers to a hashmap/hashset, this is what happens:

  • integer i has hashcode i
  • since it's less than 16, it's modified hash is also i
  • it lands in the bucket no. i
  • when iterating, the buckets are visited sequentially, so if all you stored there are small integers, they will be retrieved in ascending order

Note that if the initial number of buckets is too small, the integers may land in buckets not numbered after them:

HashSet<Integer> set = new HashSet<>(4);
set.add(5); set.add(3); set.add(1);
for(int i : set) {
  System.out.print(i);
}

prints 153.

Karol S
  • 9,028
  • 2
  • 32
  • 45
9

A HashSet as per the documentation does not guarantee any concept of order, so what you're seeing could very well change in a future update of Java.

However, if you're wondering why Java's (as of now) specific implementation of HashSet produces the result you're seeing: it's because the Integer of value 1 hashes to a location in the internal entry table of a HashMap that comes before the location to which 2 hashes (note that a HashSet is really backed by a HashMap with arbitrary values). This makes sense since the hash code of an Integer object is just its value.

In fact, you can see this even if you add even more numbers (within a certain range: the size of the entry table which is 16 by default):

Set<Integer> set = new HashSet<>();
set.add(2);
set.add(1);
set.add(4);
set.add(3);
set.add(0);
System.out.println(set);
[0, 1, 2, 3, 4]

Iteration over a HashSet takes place by iterating over the internal entry table, which means items earlier in the table come first.

arshajii
  • 127,459
  • 24
  • 238
  • 287
1

A HashSet is an unordered collection. It has no guarantees and no concepts of "ordering". See this answer for more details: What is the difference between Set and List?

You can consider using a TreeSet if you need ordered, sorted Set.

There is also a LinkedHashSet for an ordered Set that is not sorted.

Community
  • 1
  • 1
Kon
  • 10,702
  • 6
  • 41
  • 58
  • @superEb I actually just added that blurb to my response. Looks like we realized it at the same time! – Kon Sep 06 '13 at 01:43
0

A set in java is not supposed to be ordered list. Use ArrayList instead. Also check Java Collection API for further reference.

H-Patel
  • 98
  • 5
  • `LinkedHashSet` is usually a better choice if you want fast (constant-time) lookups but with predictable iteration order. Only use `ArrayList` if you're trying to save space, and can tolerate slow (linear-time) lookups. – C. K. Young Sep 06 '13 at 01:44
  • @Ashwin Note that you can get logarithm time O(log n) look up times for a sorted List by using the `Collections.binarySearch()` method. – Kon Sep 06 '13 at 01:45
0

Got a conclusion after several trial and error. This is purely based on hashing order. And that is why we call HASH set. Hashing is based on length of the input. Try 4 single digit numbers with 4 double digit numbers. You will get result.

raghul
  • 1