5

I was trying to add Integers from 2 sets into single Set via for loop and also using addAll() method provided by Collections. For test purpose, I have populated 2 Sets with Integers and then tried adding them to third set

        Set<Integer>  undId = new HashSet<Integer>();
        Set<Integer>  proxies = new HashSet<Integer>();
        //Create 2 sets with Integers
        for(int i=0;i<100;i++){
            undId.add(i);
            proxies.add(i);         
        }

and method 1 : //Now add them to third set using for loop

            for(Integer integer : undId)
            underlyings.add(integer);

        for(Integer integer :proxies)
            underlyings.add(integer);

and method 2 ://Or add them to third set using addAll()

            underlyings.addAll(undId);
        underlyings.addAll(proxies);    

Now when i was trying to time the operation using System.nanoTime(), add is twice faster (for 100,1000,10000 elements). When i increased size to 1000000 or 10000000. It was reversed. I was wondering why would it happen for larger set. I am not sure how addAll() internally handles however any help in understanding above will be appreciated. Thnx

Tim B
  • 40,716
  • 16
  • 83
  • 128
Niche
  • 142
  • 1
  • 2
  • 9
  • As far as I remember, almost all of the collections do it exactly the way you do it: just loop and call `add`. So I'd say your benchmark is flawed. – Thomas Jungblut Jun 13 '14 at 11:17
  • Did you disable the JIT or run this several times without restarting the JVM? Otherwise JIT will corrupt your timings. – brummfondel Jun 13 '14 at 11:18
  • I ran it several time. O/p was more or less similar. I will try to find a better way to benchmark. – Niche Jun 13 '14 at 11:23
  • 1
    Use [JMH](http://openjdk.java.net/projects/code-tools/jmh) or [Caliper](http://code.google.com/p/caliper) if you want any solid results. Otherwise you get just garbage. – maaartinus Jun 13 '14 at 11:30

3 Answers3

5

Before you doing anything make sure you've read and understood the discussion here: Java benchmarking - why is the second loop faster?

I would expect addAll to be faster in some situations as it has more information to work with.

For example on an ArrayList addAll can make sure it allocates enough space to add every single element in one step, rather than having to reallocate multiple times if adding large numbers of elements.

It would certainly not be slower as even a naive implementation of it would just do what you do, loop through adding the items.

Community
  • 1
  • 1
Tim B
  • 40,716
  • 16
  • 83
  • 128
0

Check the implementation of the addAll method - nothing different that what you'd do - AbstractCollection code

Vinay Rao
  • 1,284
  • 9
  • 13
0

Concerning your question about how it is handled internally, just check the available source code, e.g. Ctrl-click in eclipse and find the implementation, which in your case is in AbstractCollection:

    public boolean addAll(Collection<? extends E> c) {
      boolean modified = false;
      for (E e : c)
        if (add(e))
            modified = true;
      return modified;
    }

(from JDK 1.7)

Alexander Rühl
  • 6,769
  • 9
  • 53
  • 96
  • 2
    This may hold true for the `Set` interface and implementations, but is not the case for lists. In the case of lists, optimizations exist to ensure that the array has enough space for all the elements (both old and new). Hence, the `addAll` operation could be faster than running `add` in a loop. – Timo Apr 03 '20 at 08:17