I am experiencing something weird.
I have a big ArrayList of Long numbers. It contains about 200k numbers in ascending order. These numbers are always distinct; they are not necessarily consecutive, but some groups of them usually are.
I want to extract a sorted sample of 5k from this list, so basically this is my approach:
- I call java.util.Collections.shuffle(list);
- I extract the first 5k elements from the now shuffled list
- I sort the extracted elements in ascending order
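To make the three steps concrete, here is a minimal, self-contained sketch of what I intend to do. It is not my exact code: the class name `SortedSample`, the method `sortedSample`, and the synthetic input list are made up for illustration, and it shuffles a copy so the original list stays in ascending order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class SortedSample {

    /**
     * Returns k elements picked at random from source, in ascending order.
     * The source list itself is left untouched (hypothetical helper).
     */
    public static List<Long> sortedSample(List<Long> source, int k, Random rng) {
        // Step 1: shuffle a copy of the whole list.
        List<Long> copy = new ArrayList<>(source);
        Collections.shuffle(copy, rng);

        // Step 2: take the first k elements of the shuffled copy.
        List<Long> sample = new ArrayList<>(copy.subList(0, k));

        // Step 3: sort the extracted elements in ascending order.
        Collections.sort(sample);
        return sample;
    }

    public static void main(String[] args) {
        // Synthetic stand-in for my real data: 200k distinct ascending values.
        List<Long> all = new ArrayList<>();
        for (long i = 0; i < 200_000; i++) {
            all.add(38_000_000L + 3 * i);
        }
        List<Long> sample = sortedSample(all, 5_000, new Random());
        System.out.println(sample.size() + " values, first = " + sample.get(0));
    }
}
```

Since the input values are distinct, the 5k values returned should be distinct too, and I would expect them to be spread over the whole range of the input.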
My result is somewhat weird, though. Many of my extracted random Longs seem suspiciously close to each other, if not even consecutive. For instance, I got:
...
38414931,
38414932,
38414935,
38414937,
38414938,
38414939,
38414941,
...
This definitely does not look random :/
There is an even stranger thing.
While debugging this, I tried to write into files both the initial list and the extracted sample in order to compare them.
If I do this, my problem seems to disappear, and my extracted Longs look like proper random numbers.
I have repeated this many times, of course, and every time I observed the same two behaviours: clustered values without the file writes, and properly random-looking values with them.
Am I missing something?
EDIT: Here is the code I am using:
List<Long> allNumbers = <getting my list>;
// ---> if I write allNumbers into a file here, it seems to work fine
Collections.shuffle(allNumbers);
HashSet<Long> randomNumbers = new HashSet<>();
for (int i = 0; i < 5000; i++) {
    randomNumbers.add(allNumbers.get(i));
}