6

We have a requirement to shuffle an ArrayList using a seed

The code is something like:

List<String> tempList =  new ArrayList<>()
//code to populdate the tempList
Random rng = new Random(2018);
Collections.shuffle(tempList, rng);

P.S. The reason we provided a static Random seed is to make sure it always produce the same result after the shuffling.

What we have observed is that the shuffled result is different on dev machines(Mac) from the one on our build machines(Linux)

I am wondering if this method itself is platform dependent?

JDK details Mac is on :

Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)

Build Machine(I need more time to find out more details as I don't have access):

jdk1.8.0_162
DingDong
  • 86
  • 5
  • Are you using different JVMs? Or the official from Oracle (Hotspot)? Which version? Can you provide more information about the test setup and the machines? What about the `Random` object itself? Try generating some numbers using `nextInt` and see if the sequences are different too. – Zabuzard Aug 05 '18 at 23:58
  • Will it be different if it is on different JVMs? I will find out the version details and update the question – DingDong Aug 06 '18 at 00:02
  • 1
    What's the purpose of a predictable shuffling? – LMC Aug 06 '18 at 00:05
  • 1
    It needs to be shuffled so the list is not alphabetically ordered. The reason it needs to be predictable is that it is the same shuffled order on dev machines and build machines. – DingDong Aug 06 '18 at 00:21
  • Re-opening because even though the RNG is guaranteed to generate the same sequence, different JVM versions/configurations may still use different shuffling algorithms. – Thilo Aug 06 '18 at 04:05
  • 1
    There should be no variation, even if the Java versions are different. From [the docs](https://docs.oracle.com/javase/10/docs/api/java/util/Random.html): “In order to guarantee this property, particular algorithms are specified for the class Random. Java implementations must use all the algorithms shown here for the class Random, for the sake of absolute portability of Java code.” – VGR Aug 06 '18 at 05:29
  • So the shuffle method may differ from OS to OS right? – DingDong Aug 06 '18 at 05:30
  • You state that the problematic code is “something like” what you’ve posted. Are you in fact initializing your Random instance with a constant numeric argument? `new Random()` would not be the same across program invocations. – VGR Aug 06 '18 at 05:33
  • I am deliberately initializing the Random instance with a constant numeric argument, if I had posted the original code it will be too much and too distractive. – DingDong Aug 06 '18 at 07:41
  • In general, portability across JVMs or versions is often not ensured. In particular if you use an external JVM instead of Hotspot. Because of that you should generally use the exact same JVM and version, if possible. Please check if the `Random` generates different sequences too. If so, it's the shuffle algorithms fault. The documentation explicitly states that all implementations of `Random` **must** produce the same sequence. If it doesn't, the implementation is not conform to the documentation. – Zabuzard Aug 06 '18 at 09:27

1 Answers1

0

As I understand it, you're asking whether this Java program is guaranteed to print bcdea on every Java implementation in existence.

import java.util.Random;
import java.util.ArrayList;
import java.util.Collections;

class Main {
    public static void main(String[] args) {
        Random rng = new Random(42);
        ArrayList<String> list = new ArrayList<String>();
        list.add("a");
        list.add("b");
        list.add("c");
        list.add("d");
        list.add("e");
        Collections.shuffle(list, rng);
        for (String s : list) System.out.print(s);
    }
};

tio.run says that it produces the same output at least between "OpenJDK 8" and whatever-it-is they bill as "JDK". But that's an extremely small and boring sample.

The official Oracle docs are not reassuring:

Randomly permute the specified list using the specified source of randomness. All permutations occur with equal likelihood assuming that the source of randomness is fair.

This implementation traverses the list backwards, from the last element up to the second, repeatedly swapping a randomly selected element into the "current position". Elements are randomly selected from the portion of the list that runs from the first element to the current position, inclusive. [Emphasis added.]

That is, Oracle doesn't claim that every implementation works that way; nor do they even bother to document in what way the swapped element is "randomly selected."

Btw, I arrived at your question after seeing Collections.shuffle used in U.S. voting system software with exactly this assumption: that its behavior is reproducible no matter what JDK you're using. I agree that it's not at all clear whether this is the case.

(It's not the case for C++ std::shuffle, by the way. There, different library implementations can and do give different shuffles for the same input.)

Quuxplusone
  • 23,928
  • 8
  • 94
  • 159