Consider the following Scala code:
var myStream = Stream(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
var cnt = 0

def duplicate(in: Stream[Int]): Stream[Int] = {
  in.flatMap(elem => {
    cnt += 1
    Stream(elem, elem)
  })
}

myStream = duplicate(myStream)
myStream = duplicate(myStream)
myStream = duplicate(myStream)
myStream = duplicate(myStream)

"First elem: " + myStream.head
"cnt: " + cnt
"Size: " + myStream.size
"cnt: " + cnt
The last 4 lines print:
First elem: 1
cnt: 4
Size: 160
cnt: 150
This is exactly as expected: to compute the first element, flatMap has to run 4 times (once per duplicate stage), so cnt is equal to 4 at that step.
Now consider the Java example:
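import java.util.stream.IntStream;

public class Main { // class name is arbitrary, added only so the sample compiles
    static int cnt = 0;

    public static void main(String[] args) {
        IntStream range = IntStream.range(1, 11);
        range = duplicate(range);
        range = duplicate(range);
        range = duplicate(range);
        range = duplicate(range);
        System.out.println(
            range.findAny() // or range.findFirst()
        );
        System.out.println(cnt);
        // Second run: disable the two printlns above and enable these instead.
        //System.out.println(
        //    range.count()
        //);
        //System.out.println(cnt);
    }

    // Doubles every element and counts how often the mapper is invoked.
    private static IntStream duplicate(IntStream elem) {
        return elem.flatMap(e -> {
            ++cnt;
            return IntStream.of(e, e);
        });
    }
}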
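The Java sample needs to be run twice, because findAny (like findFirst and count) is a terminal operation and the stream cannot be reused afterwards: once with the find call and once with the commented-out count call. After the two runs: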
first element is: OptionalInt[1]
cnt: 15
Size: 160
cnt: 150
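Both measurements can also be taken in a single run by rebuilding the pipeline before each terminal operation. The following is only a sketch: the class name `StreamTwice` and the helper `pipeline()` are made-up names for illustration, not part of the code above. The value printed after `findFirst()` depends on how the JDK in use implements `flatMap`, so no expected output is given for it.

```java
import java.util.stream.IntStream;

class StreamTwice { // hypothetical class name
    static int cnt = 0;

    // Same mapper as in the question: doubles each element and counts calls.
    static IntStream duplicate(IntStream in) {
        return in.flatMap(e -> {
            ++cnt;
            return IntStream.of(e, e);
        });
    }

    // Hypothetical helper: builds a fresh four-stage pipeline on every call,
    // so each terminal operation gets its own, unconsumed stream.
    static IntStream pipeline() {
        IntStream s = IntStream.range(1, 11);
        for (int i = 0; i < 4; i++) {
            s = duplicate(s);
        }
        return s;
    }

    public static void main(String[] args) {
        cnt = 0;
        System.out.println("First elem: " + pipeline().findFirst());
        System.out.println("cnt: " + cnt); // JDK-dependent

        cnt = 0;
        System.out.println("Size: " + pipeline().count());
        System.out.println("cnt: " + cnt);
    }
}
```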
It looks like stream laziness is somehow broken in Java. I have no idea why Java needs 15 flatMap calls (cnt: 15) to compute the first element.
For small examples this is not an issue, but in a much more complicated flow it can cause performance problems when one only cares about the first stream element.
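For comparison, the element-by-element laziness of the Scala version can be imitated in Java with plain iterators instead of `flatMap`. This is only a sketch under assumptions: the class name `LazyDuplicate`, the helper `pipeline()`, and the iterator-based `duplicate` are illustrative names, not part of the Stream API, and it assumes that pulling elements one at a time through an iterator is acceptable for the use case.

```java
import java.util.NoSuchElementException;
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

class LazyDuplicate { // hypothetical class name
    static int cnt = 0;

    // Iterator-based stand-in for flatMap(e -> IntStream.of(e, e)):
    // an upstream element is pulled only when the current pair is used up.
    static PrimitiveIterator.OfInt duplicate(PrimitiveIterator.OfInt in) {
        return new PrimitiveIterator.OfInt() {
            private PrimitiveIterator.OfInt current = IntStream.empty().iterator();

            @Override
            public boolean hasNext() {
                while (!current.hasNext()) {
                    if (!in.hasNext()) {
                        return false;
                    }
                    int e = in.nextInt();
                    ++cnt; // one "mapper call" per upstream element
                    current = IntStream.of(e, e).iterator();
                }
                return true;
            }

            @Override
            public int nextInt() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.nextInt();
            }
        };
    }

    // Hypothetical helper: range 1..10 passed through four duplicate stages.
    static PrimitiveIterator.OfInt pipeline() {
        PrimitiveIterator.OfInt it = IntStream.range(1, 11).iterator();
        for (int i = 0; i < 4; i++) {
            it = duplicate(it);
        }
        return it;
    }

    public static void main(String[] args) {
        PrimitiveIterator.OfInt it = pipeline();
        System.out.println("First elem: " + it.nextInt());
        System.out.println("cnt: " + cnt);
    }
}
```

With this sketch the first element costs one mapper call per stage (4 in total), and a full traversal costs 150, the same numbers as in the Scala example.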
Yep, my question is a duplicate of that one. Thanks. – slowikps Apr 16 '16 at 14:55