41

I tried to translate the following line of Scala to Java 8 using the Streams API:

// Scala
util.Random.shuffle((1 to 24).toList)

To write the equivalent in Java I created a range of integers:

IntStream.range(1, 25)

I suspected to find a toList method in the stream API, but IntStream only knows the strange method:

collect(
  Supplier<R> supplier, ObjIntConsumer<R> accumulator, BiConsumer<R,R> combiner)

How can I shuffle a list with Java 8 Streams API?

Stuart Marks
  • 127,867
  • 37
  • 205
  • 259
deamon
  • 89,107
  • 111
  • 320
  • 448

8 Answers8

50

Here you go:

List<Integer> integers =
    IntStream.range(1, 10)                      // <-- creates a stream of ints
        .boxed()                                // <-- converts them to Integers
        .collect(Collectors.toList());          // <-- collects the values to a list

Collections.shuffle(integers);

System.out.println(integers);

Prints:

[8, 1, 5, 3, 4, 2, 6, 9, 7]
Brian Goetz
  • 90,105
  • 23
  • 150
  • 161
Andrey Chaschev
  • 16,160
  • 5
  • 51
  • 68
  • 1
    In practice, it'd be better to map using Integer::valueOf() in order to take advantage of the Integer cache, even if it hardly matters for such a small range. – Mike Strobel Nov 18 '13 at 21:40
  • 2
    I edited the answer to use `IntStream.boxed()`, which is pretty much canonical. – Louis Wasserman Nov 18 '13 at 21:42
  • I hadn't seen this answer when I posted mine, and didn't know about `boxed()`; this is better than what I had, although I think you can statically import `Collectors.toList` and can let the compiler infer the type of the list. – David Conrad Nov 18 '13 at 21:50
  • Thank you very much. The beauty of Scala is just so obvious to me. – deamon Nov 18 '13 at 21:51
  • Yeah, I am still surprised by the verbosity of Java 8. Chances are this would be resolved in libraries like Guava. – Andrey Chaschev Nov 18 '13 at 21:54
  • @AndreyChaschev - I'm surprised that people are surprised! 1) The verbosity of Java 8 is a consequence of the need to maintain source code compatibility with earlier versions of the Java language. 2) This is not something a 3rd party library could "fix". 3) If this is a serious problem for you, consider using a different JVM-based language. (And an interesting new contender just went "1.0" ...) – Stephen C Nov 18 '13 at 22:48
  • @StephenC Many thanks for a tip - haven't thought of Dart as a JVM language and will definitely try it when it gets better JVM/IDE support. As for example, I might be wrong, but this would be more concise and efficient if collectors/consumers for primitives had been introduced. That's what I'm expecting to see in i.e. Trove collections... – Andrey Chaschev Nov 19 '13 at 08:52
  • @AndreyChaschev I think StephenC meant Ceylon, not Dart. However I think Scala is the better language. Maybe Kotlin has a good chance because of the good IDE support from JetBrains. – deamon Nov 19 '13 at 13:23
  • @deamon I have been expecting Kotlin to shoot, but now I have doubts it will survive Dart and Scala. – Andrey Chaschev Nov 19 '13 at 13:55
  • 30
    Ugh. It is not really a streams solution. – ncmathsadist Apr 29 '15 at 00:25
  • 15
    This is not guaranteed to work as written per [the API documentation](https://docs.oracle.com/javase/8/docs/api/java/util/stream/Collectors.html#toList--), since `toList()` doesn't guarantee the mutability of the `List` returned. This works in practice since the current implementation of `toList()` happens to return an `ArrayList`, which is mutable. To ensure continued correctness if the implementation changes, we'd could explicitly declare the collection type (`Collectors.toCollection(ArrayList::new)`), or copy the results of `toList` into a mutable list (`new ArrayList<>(integers)`). – M. Justin Sep 21 '18 at 20:14
  • 1
    @M.Justin or just switch to Kotlin/Scala to avoid lengthy talks – Andrey Chaschev Sep 23 '18 at 02:31
50

You may find the following toShuffledList() method useful.

private static final Collector<?, ?, ?> SHUFFLER = Collectors.collectingAndThen(
        Collectors.toCollection(ArrayList::new),
        list -> {
            Collections.shuffle(list);
            return list;
        }
);

@SuppressWarnings("unchecked")
public static <T> Collector<T, ?, List<T>> toShuffledList() {
    return (Collector<T, ?, List<T>>) SHUFFLER;
}

This enables the following kind of one-liner:

IntStream.rangeClosed('A', 'Z')
         .mapToObj(a -> (char) a)
         .collect(toShuffledList())
         .forEach(System.out::print);

Example output:

AVBFYXIMUDENOTHCRJKWGQZSPL
Paul Boddington
  • 37,127
  • 10
  • 65
  • 116
  • 6
    This should be the answer – bhantol Mar 21 '18 at 02:49
  • 3
    This is not guaranteed to work as written per [the API documentation](https://docs.oracle.com/javase/8/docs/api/java/util/stream/Collectors.html#toList--), since `toList()` doesn't guarantee the mutability of the `List` returned. This works in practice since the current implementation of `toList()` happens to return an `ArrayList`, which is mutable. To ensure continued correctness if the implementation changes, we'd could explicitly declare the collection type (`Collectors.toCollection(ArrayList::new)`), or copy the results of `toList` into a mutable list (`new ArrayList<>(integers)`). – M. Justin Sep 21 '18 at 20:15
  • 7
    @M.Justin Thanks, I've edited the answer. This is one of the most infuriating things about the Stream API. It doesn't guarantee mutability or thread safety of the returned List, or even whether it has an O(1) `get` method. Along with so much of the Stream API, this is just terrible. Taking the argument to its logical conclusion, the only methods on the returned lists you should ever actually use are `size` and `iterator`. – Paul Boddington Nov 06 '18 at 17:22
  • 2
    @PaulBoddington Taking it even further, it's worth noting that some specialty collection implementations don't even have a O(1) `size` operation (though those are definitely outliers) — https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentLinkedQueue.html#size-- – M. Justin Mar 05 '19 at 17:09
16

You can use a custom comparator that "sorts" the values by a random value:

public final class RandomComparator<T> implements Comparator<T> {

    private final Map<T, Integer> map = new IdentityHashMap<>();
    private final Random random;

    public RandomComparator() {
        this(new Random());
    }

    public RandomComparator(Random random) {
        this.random = random;
    }

    @Override
    public int compare(T t1, T t2) {
        return Integer.compare(valueFor(t1), valueFor(t2));
    }

    private int valueFor(T t) {
        synchronized (map) {
            return map.computeIfAbsent(t, ignore -> random.nextInt());
        }
    }

}

Each object in the stream is (lazily) associated a random integer value, on which we sort. The synchronization on the map is to deal with parallel streams.

You can then use it like that:

IntStream.rangeClosed(0, 24).boxed()
    .sorted(new RandomComparator<>())
    .collect(Collectors.toList());

The advantage of this solution is that it integrates within the stream pipeline.

Xavier
  • 1,536
  • 2
  • 16
  • 29
  • 1
    See [this answer](http://stackoverflow.com/a/29251881/1553851) for a nice alternative. – shmosel Aug 26 '16 at 20:34
  • 6
    this answer doesn't work well anymore due to a change in the sort algorithm: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6804124 that can complain when the comparator doesn't respect a basic contract. – user1928596 Jul 10 '17 at 00:48
  • The change in sort algorithm happened in 2009, so nothing new here. Do you have examples where this breaks? The only issue I see is inconsistency with equals in case of (unlikely) collisions in random.nextInt(), and even that shouldn't prevent the shuffling to happen. – Xavier Jul 18 '17 at 08:57
  • @user1928596 What contract is the comparator breaking? Since it's storing a mapping of each value and comparing those, it meets all the criteria described in the requirements of `Comparator.compareTo`. – M. Justin Sep 21 '18 at 20:38
  • 2
    This will not fully shuffle if there are duplicate elements in the list, as they will always end up adjacent to each other. For instance, the list `[1,5,1,1,5]` will become either `[5,5,1,1,1]` or `[1,1,1,5,5]`. – M. Justin Sep 21 '18 at 20:41
  • 1
    @user1928596 Except it doesn't violate that. It's storing a (random) mapping of `T` => `Integer`, and comparing on that integer. Since the mapping doesn't change during the life of the comparator, `f(A) f(A) { synchronized (map) { return map.computeIfAbsent(t, ignore -> random.nextInt()); }});`, which perhaps more clearly illustrates how this meets the basic contract. – M. Justin Sep 21 '18 at 22:57
6

If you want to process the whole Stream without too much hassle, you can simply create your own Collector using Collectors.collectingAndThen():

public static <T> Collector<T, ?, Stream<T>> toEagerShuffledStream() {
    return Collectors.collectingAndThen(
      toList(),
      list -> {
          Collections.shuffle(list);
          return list.stream();
      });
}

But this won't perform well if you want to limit() the resulting Stream. In order to overcome this, one could create a custom Spliterator:

package com.pivovarit.stream;

import java.util.List;
import java.util.Objects;
import java.util.Random;
import java.util.RandomAccess;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.function.Supplier;

class ImprovedRandomSpliterator<T, LIST extends RandomAccess & List<T>> implements Spliterator<T> {

    private final Random random;
    private final List<T> source;
    private int size;

    ImprovedRandomSpliterator(LIST source, Supplier<? extends Random> random) {
        Objects.requireNonNull(source, "source can't be null");
        Objects.requireNonNull(random, "random can't be null");

        this.source = source;
        this.random = random.get();
        this.size = this.source.size();
    }

    @Override
    public boolean tryAdvance(Consumer<? super T> action) {
        if (size > 0) {
            int nextIdx = random.nextInt(size);
            int lastIdx = --size;

            T last = source.get(lastIdx);
            T elem = source.set(nextIdx, last);
            action.accept(elem);
            return true;
        } else {
            return false;
        }
    }

    @Override
    public Spliterator<T> trySplit() {
        return null;
    }

    @Override
    public long estimateSize() {
        return source.size();
    }

    @Override
    public int characteristics() {
        return SIZED;
    }
}

and then:

public final class RandomCollectors {

    private RandomCollectors() {
    }

    public static <T> Collector<T, ?, Stream<T>> toImprovedLazyShuffledStream() {
        return Collectors.collectingAndThen(
          toCollection(ArrayList::new),
          list -> !list.isEmpty()
            ? StreamSupport.stream(new ImprovedRandomSpliterator<>(list, Random::new), false)
            : Stream.empty());
    }

    public static <T> Collector<T, ?, Stream<T>> toEagerShuffledStream() {
        return Collectors.collectingAndThen(
          toCollection(ArrayList::new),
          list -> {
              Collections.shuffle(list);
              return list.stream();
          });
    }
}

I explained the performance considerations here: https://4comprehension.com/implementing-a-randomized-stream-spliterator-in-java/

Grzegorz Piwowarek
  • 13,172
  • 8
  • 62
  • 93
3

To perform a shuffle efficiently you need all the values in advance. You can use Collections.shuffle() after you have converted the stream to a list like you do in Scala.

Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130
1

If you're looking for a "streaming only" solution and a deterministic, merely "haphazard" ordering versus a "random" ordering is Good Enough, you can always sort your ints by a hash value:

List<Integer> xs=IntStream.range(0, 10)
    .boxed()
    .sorted( (a, b) -> a.hashCode() - b.hashCode() )
    .collect(Collectors.toList());

If you'd rather have an int[] than a List<Integer>, you can just unbox them afterwards. Unfortunately, you have go through the boxing step to apply a custom Comparator, so there's no eliminating that part of the process.

List<Integer> ys=IntStream.range(0, 10)
    .boxed()
    .sorted( (a, b) -> a.hashCode() - b.hashCode() )
    .mapToInt( a -> a.intValue())
    .toArray();
sigpwned
  • 6,957
  • 5
  • 28
  • 48
  • 1
    I don't think this will work. The problem is that `Integer::hashCode` returns the same thing as `Integer::intValue`. (It is in the javadoc!) And even if you use `System::identityHashCode`, it is going to be system dependent how random the shuffle is, for a couple of reasons. – Stephen C Jun 04 '21 at 04:03
1
List<Integer> randomShuffledRange(int startInclusive, int endExclusive) {
    return new Random().ints(startInclusive, endExclusive)
            .distinct()
            .limit(endExclusive-startInclusive)
            .boxed()
            .collect(Collectors.toList());
}

var shuffled = randomShuffledRange(1, 10);
System.out.println(shuffled);

Example output:

[4, 6, 8, 9, 1, 7, 3, 5, 2]
Ron McLeod
  • 643
  • 7
  • 12
-2

This is my one line solution: I am picking one random color:

colourRepository.findAll().stream().sorted((o1,o2)-> RandomUtils.nextInt(-1,1)).findFirst().get()
kozla13
  • 1,854
  • 3
  • 23
  • 35
  • 5
    This `Comparator` violates the general contract for `Comparator`. Comparing the same two values isn't consistent, and comparing `x` with `y` and `y` with `x` will not necessarily return opposite results. With the right input and random seed, I was able to get sorting this way to throw an exception: `java.lang.IllegalArgumentException: Comparison method violates its general contract!`. On my version of Java, one specific example is: `Random r = new Random(9); Collections.nCopies(32, 1).stream().sorted((o1, o2) -> r.nextInt(3) - 1).findFirst().get()` – M. Justin Sep 21 '18 at 21:03