1

What's the best way to get a random element from a Collection? I've heard iteration in the best, so I've done the following:

    Collection<Integer> c = new HashSet<Integer>();
    Random r = new Random();
    for (int i = 0; i < 100000; i++){
        c.add(r.nextInt());
    }

    Iterator<Integer> i = c.iterator();
    int random = r.nextInt(c.size());
    int num = 0;
    int count = 1;
    while(i.hasNext()){
        num = i.next();
        if (count == random){
            break;
        }
        count++;
    }
    System.out.println(num);

It works fine, as far as I can tell and only takes a couple of milliseconds to complete. However, I've been told that the above is overcomplicating the problem. I know you can convert the collection to an array or in Java 8 you can use streams.

  • what are you trying to achieve with this code? – YoungHobbit Jan 06 '16 at 16:54
  • The above code is just a test to see what the best method is. – The Gaming Grunts Jan 06 '16 at 16:55
  • 3
    `Collections.shuffle` and get the first (if you don't care about the order of the collection afterward) –  Jan 06 '16 at 16:59
  • 4
    "I've heard iteration in the best" - that entirely depends on context. If you've got a collection with random access by index instead, just use the index. – Jon Skeet Jan 06 '16 at 16:59
  • @JonSkeet: In this case, the OP really doesn't have that luxury...it's just a blanket `Collection` instead of something more concrete. – Makoto Jan 06 '16 at 17:06
  • @Makoto: Yes, but it's not clear whether this is the OP's *actual* situation, given the earlier comments. – Jon Skeet Jan 06 '16 at 17:08
  • @RC. `Collections.shuffle` only works for lists, for which you could also use `get(random)`. – Clashsoft Jan 06 '16 at 17:09
  • @JonSkeet I'm just using numbers purely as a test. In reality, the collection would contain Player objects and the goal is just to get a random one from the collection. – The Gaming Grunts Jan 06 '16 at 17:12
  • It's not what the collection *contains* that's important, but the type of the collection. If it's a collection that allows random access by index, use that. If you can only iterate, do that. – Jon Skeet Jan 06 '16 at 17:22

3 Answers3

1

Abandon Collection; the interface isn't flexible enough to give you the ability to select an element by index.

Abandon HashSet; Sets in general don't support random indexing.

You'll want to use a List, and make use of Collections#shuffle to accomplish what you're interested in.

Makoto
  • 104,088
  • 27
  • 192
  • 230
0

Converting a Collection to an array and then accessing a random cell would likely make for more compact code, but possibly less performant code.

Let's consider the case that the Collection you're going to convert toArray has the following underlying data structure:

  1. Array. It's possible that something like System.arrayCopy may be used to do this conversion in constant time.
  2. Linked-list/tree. In this case creating an array will almost certainly take linear time. A new array will have to be allocated and then populated by walking the data structure.

In either case you've also added additional memory overhead by converting to an array first.

At the end of the day, understanding how your algorithm will perform depends on anticipating what the distribution of Collection instance types you expect to be dealing with.

If you are dealing with Collections that use an array as an underlying data structure then generating a random integer and accessing that cell should take (a very short) constant time.

On the other hand if your underlying data structure is a linked list, or tree then generating a random integer and accessing that cell could take linear time!

If you're happy with linear time, then you can probably leave your solution as be. If you are willing to put together a solution less general by restricting the types of Collections that can be used, you could likely improve performance.

Community
  • 1
  • 1
vpiTriumph
  • 3,116
  • 2
  • 27
  • 39
0

Collection doesn't allow direct access to elements by index. It is the most generic interface of the JDK collection classes and thus covers both ordered and unordered lists.

If you insist on using the Collection interface, something like this should work, which is similar to your code but more generic:

public static <T> T getRandomElement(Collection<T> c) {
    int random = (int) (Math.random() * c.size());
    Iterator<T> it = c.iterator();
    for (int i = 0; i < random; i++) {
        it.next();
    }
    return it.next();
}

However you should be asking yourself if using the List interface instead might be possible in your case, as it allows simple access by (random) index using the get method.