0

Which way would be the fastest to add an element to a set?

The way I see it you could do this in 2 different ways:

  • Check if the element you want to add is allready in the set and if it is not in the set, add it.

  • Just add it, because a set has unique elements. I am not sure this way will not throw an error.

Sander
  • 142
  • 2
  • 15

3 Answers3

5

Just add it directly. It have mechanism to check.

From docs of add method.

Adds the specified element to this set if it is not already present. More formally, adds the specified element e to this set if this set contains no element e2 such that (e==null ? e2==null : e.equals(e2)). If this set already contains the element, the call leaves the set unchanged and returns false.

Suresh Atta
  • 120,458
  • 37
  • 198
  • 307
2

I did a little work to check that out for you

Random rng = new Random();
ArrayList<Integer> numbers = new ArrayList<>();
for (int i = 0; i < 100000; ++i)
    numbers.add(rng.nextInt());
    
HashSet<Integer> set = new HashSet<>();
long beginNoCheck = System.nanoTime();
for (int i : numbers)
{
    set.add(i);
}
long endNoCheck = System.nanoTime();
   
set = new HashSet<>();
long beginCheck = System.nanoTime();
for (int i : numbers)
{
    if (!set.contains(i))
        set.add(i);
}
long endCheck = System.nanoTime();
    
System.out.println("Without check: " + (endNoCheck - beginNoCheck));
System.out.println("With check: " + (endCheck - beginCheck));

And it seems, that checking if HashSet contains an element before adding it is a bit faster. I am getting results like

Without check: 66060748
With check: 46209808

Without check: 38249163
With check: 32058546

Without check: 58362677
With check: 34429848

Without check: 52095512
With check: 39612073

Without check: 34892990
With check: 28945278

Without check: 42090287
With check: 38426209

Community
  • 1
  • 1
Zereges
  • 5,139
  • 1
  • 25
  • 49
  • Thank you for your elaborate answer. I didn't know how to do these test myself so I am very greatfull that you did them for me. – Sander Apr 03 '16 at 08:51
  • 1
    It can't be faster. Just think about it: we are adding manual `contains` check which is totally independent of the `add` operation. It can only make it run slower. Your test design is wrong, try to run these tests independently. – Aleksandr Erokhin Apr 03 '16 at 08:56
  • @Zereges I think your test suffers from some kind of JVM warmup effect. Please see this thread about correct micro benchmark design in java: http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java – Aleksandr Erokhin Apr 03 '16 at 09:15
1

You can also always check the implementations to see what's going on behind the curtain:

HashSet is just a HashMap

http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/HashMap.java#HashMap.getEntry%28java.lang.Object%29

Tests like the one posted below might have different results on different machines or even just in different situations. If there is no really huge difference between runtimes, readability is usually a lot better than a few milliseconds. Usually a simple proper concept and design of the whole software outspeeds convoluted tricks by far.

JayC667
  • 2,418
  • 2
  • 17
  • 31