2

I have a problem with Sets/Iterators in Java. I'm executing code that iterates over elements in a hash set, removing them after they have been used but also removing elements that have been deemed unnecessary to iterate over within the loop. I'm also adding elements to the loop. Here's a code sample:

Set<Integer> thisSet = new HashSet<Integer>();
// add elements into set
while (!thisSet.isEmpty()) {
   int value = thisSet.iterator().next();
   thisSet.remove(value);
   // more remove and add operations
}

I chose hash sets because I thought that the remove operations during the loop would be a lot faster than when using a list. The problem is that statistics show me that if the set gets large, polling a value from the set actually takes up a lot of time (I assume due to creating an iterator every time?). Does anyone have suggestions on how to improve this?

Thanks!

tbodt
  • 16,609
  • 6
  • 58
  • 83
Anja
  • 21
  • 2

2 Answers2

0

You can find an articles about collections performance measurements: http://www.artima.com/weblogs/viewpost.jsp?thread=122295 (The Final Performance Testing Example by Bruce Eckel)

or http://java.dzone.com/articles/java-collection-performance (Java Collection Performance)

Maybe it will help you to choose the right collection for your implementation.

andreih
  • 31
  • 3
0

Don't have enough reputation to add a comment to Hovercraft Full Of Eels, but he's absolutely right. Removing an item from a set without using an iterator will eventually lead to a ConcurrentModificationException one way or another. +1

If we're looking at this from perspectives of datastructures, you need a datastructure backed up by a Linked List to make insertion/deletion faster. See Linked lists vs. dynamic arrays @ https://en.wikipedia.org/wiki/Linked_list for insertion/deletion O() bounds.

Ultimately, my suggestion is to backup your Set by a LinkedHashSet object and use a profiler to examine execution times for the sample inputs you're going to use.

Evgheni Crujcov
  • 470
  • 2
  • 8