Let's assume I have an array containing 2 million ids, and I want to retrieve a sample of these ids. At the moment I use random sampling (a shuffle), as proposed in this question's answer:
private static void shuffleScoreArray(ScoreDoc[] ar) {
    Random rnd = new Random();
    for (int i = ar.length - 1; i > 0; i--) {
        int index = rnd.nextInt(i + 1);
        // Simple swap
        ScoreDoc a = ar[index];
        ar[index] = ar[i];
        ar[i] = a;
    }
}
This works well, but how can I retrieve a non-random (and more or less well-distributed; it doesn't have to be 100% uniform) sample? Non-random in this case means that if I call the function twice with the same input array, I get the same sample both times.
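To make the requirement concrete: picking evenly spaced elements would already count as "non-random" in this sense, although it depends entirely on the order of the input array. A minimal sketch, assuming plain `int` ids instead of `ScoreDoc` objects (the class name `StrideSample` and the method `sample` are hypothetical, chosen only for illustration):

```java
class StrideSample {
    /**
     * Returns up to sampleSize elements of ids, taken at evenly spaced
     * positions. The same input always yields the same output.
     */
    public static int[] sample(int[] ids, int sampleSize) {
        int n = Math.min(sampleSize, ids.length);
        int[] out = new int[n];
        for (int k = 0; k < n; k++) {
            // long arithmetic avoids int overflow for large arrays
            int index = (int) ((long) k * ids.length / n);
            out[k] = ids[index];
        }
        return out;
    }
}
```

With ids 0 to 9 and a sample size of 5, this picks the elements at positions 0, 2, 4, 6 and 8.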
I did a lot of research on SO and Google but couldn't find an approach that helps in this case. Most approaches on SO seem to deal with random sampling or with improving performance.
What I could imagine (but don't know whether it works) is always using the same Random object, but I'm unsure how to turn this into working Java code.
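A minimal sketch of that idea, under the assumption that "the same Random" really means "the same seed": a `java.util.Random` created with a fixed seed produces the same sequence of numbers every time, whereas reusing one Random instance across calls would not, because its internal state advances. The sketch uses plain `int` ids instead of `ScoreDoc`, and the class name `SeededSample`, the method `sample` and the seed value are hypothetical:

```java
import java.util.Arrays;
import java.util.Random;

class SeededSample {
    // Hypothetical fixed seed; any constant works as long as it never changes.
    private static final long SEED = 42L;

    /**
     * Shuffles a copy of ids with a freshly seeded generator and returns
     * the first sampleSize elements. The input array is left untouched.
     */
    public static int[] sample(int[] ids, int sampleSize) {
        int[] copy = Arrays.copyOf(ids, ids.length);
        Random rnd = new Random(SEED); // new generator per call, same seed each time
        for (int i = copy.length - 1; i > 0; i--) {
            int index = rnd.nextInt(i + 1);
            // Simple swap
            int tmp = copy[index];
            copy[index] = copy[i];
            copy[i] = tmp;
        }
        return Arrays.copyOf(copy, Math.min(sampleSize, copy.length));
    }

    public static void main(String[] args) {
        int[] ids = new int[2_000_000];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = i;
        }
        int[] first = sample(ids, 10);
        int[] second = sample(ids, 10);
        System.out.println(Arrays.equals(first, second)); // prints "true"
    }
}
```

Copying 2 million ids on every call costs some memory; shuffling in place as in the original method would avoid that, but then the second call would start from an already shuffled array and return a different sample.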
Thanks a lot for every thought and answer you share with me.