I have a Java 8 stream from which I want to (uniformly) randomly select an element. The stream can contain anywhere from zero to tens of thousands of elements.
I have implemented an algorithm that selects one using a MapReduce-like pattern, but for the very small streams it would probably be more efficient to just collect the items into a List and return one with a random index. For that I have to count them, however. Streams do have a count() method but that counts them all, I'm not really interested in the actual count, all I care about is whether it contains more than a to-be-determined number. Does anyone know if such a method exists? I can't find it but there might be something I'm overlooking or some clever trick for finding it anyway.
P.S.: I'm aware that sometimes it's not necessary to optimize code; but I would like to try it nonetheless, just for the experience. I'm a student.
P.P.S.: I've copied my algorithm here, in case anyone's interested (or wants to look for bugs, I haven't tested it yet ;-)
stream
.parallel()
.map(t -> new Pair<T, Integer>(t, 1))
.reduce((Pair<T, Integer> t, Pair<T, Integer> u) -> {
if (rand.nextDouble() <= (t.getValue1() / (double) (t.getValue1() + u.getValue1()))) {
return new Pair<>(t.getValue0(), t.getValue1() + u.getValue1());
} else {
return new Pair<>(u.getValue0(), t.getValue1() + u.getValue1());
}
})
.map(t -> t.getValue0());
(The pairs are from org.javatuples, now that Java supports functional programming-like interfaces the lack of tuples does become a bit painful).