14

Can anyone point to a official Java documentation which describes how many times Stream will invoke each "non-interfering and stateless" intermediate operation for each element.

For example:

Arrays.asList("1", "2", "3", "4").stream()
        .filter(s -> check(s))
        .forEach(s -> System.out.println(s));

public boolean check(Object o) {
    return true;
} 

The above currently will invoke check method 4 times.

Is it possible that in the current or future versions of JDKs the check method gets executed more or less times than the number of elements in the stream created from List or any other standard Java API?

Mani
  • 1,068
  • 3
  • 13
  • 27
tsolakp
  • 5,858
  • 1
  • 22
  • 28

2 Answers2

15

This does not have to do with the source of the stream, but rather the terminal operation and optimization done in the stream implementation itself. For example:

Stream.of(1,2,3,4)
      .map(x -> x + 1)
      .count();

Since java-9, map will not get executed a single time.

Or:

 someTreeSet.stream()
            .sorted()
            .findFirst();

sorted might not get executed at all, since the source is a TreeSet and getting the first element is trivial, but if this is implemented inside stream API or not, is a different question.

So the real answer here - it depends, but I can't imagine one operation that would get executed more that the numbers of elements in the source.

Eugene
  • 117,005
  • 15
  • 201
  • 306
  • 2
    "*but I can't imagine one operation that would get executed more that the numbers of elements in the source*" - Well, if you specify a `Comparator` for sorting, its `compare` will definitely be invoked more times than the number of elements, but that's just nitpicking. +1 – Fureeish Nov 09 '18 at 23:46
  • 1
    @Fureeish no no, very good nitpick, indeed you are right; but it seems `sorted` is the only one that could do that. `distinct` does not for example – Eugene Nov 09 '18 at 23:53
  • 1
    Some time ago I asked [this question](https://stackoverflow.com/questions/32656888/recursive-use-of-stream-flatmap) about jdk8 `flatMap` (also see [this related question](https://stackoverflow.com/questions/29229373/why-filter-after-flatmap-is-not-completely-lazy-in-java-streams)). In short, in jdk8&9 `flatMap` is not completely lazy and elements can be *visited* more than once, while in jdk10 this behavior has been fixed. – fps Nov 10 '18 at 00:13
  • @FedericoPeraltaSchaffner I think that one is about laziness only, not how many times an element can be visited – Eugene Nov 10 '18 at 00:14
  • Yes, but the thing is that for older versions, an element might end up being visited more than once (as my question shows by means of the `peek` method) – fps Nov 10 '18 at 00:16
  • Agree that it depends on terminal operation too. But even for common one like `forEach`. Will its behavior change like it did with `map/count`. Or what about `map/forEach` or `filter/count`? – tsolakp Nov 10 '18 at 00:16
  • @tsolakp you would to reason about them individually, `map/forEach` will always get executed, unless the stream API or JIT I guess can prove that `forEach` is really a NO-OP, `filter/count` will always get executed too, since `filter` breaks the `SIZED` flag... – Eugene Nov 10 '18 at 00:18
  • @Eugene. `Will always` does not exclude more then once for each. – tsolakp Nov 10 '18 at 00:20
  • @tsolakp well of course if you use a `flatMap` like `Stream.of(1, 1) .flatMap(x -> Stream.of(x, x)) .forEach(System.out::println);` will get executed 4 times, if that is what you mean – Eugene Nov 10 '18 at 00:22
  • @Eugene. No, I am still talking about question's or your first example. – tsolakp Nov 10 '18 at 00:24
  • @tsolakp you have to make up your mind, in your question you are talking about an intermediate operation, and here about a terminal one, `forEach` - even the same of it suggests what it will do and for how many elements – Eugene Nov 10 '18 at 00:32
  • 1
    @FedericoPeraltaSchaffner as said earlier I can write it as `Stream.of(1, 1) .flatMap(x -> Stream.of(x, x)) .forEach(System.out::println);` does it mean that it gets visited more than once? – Eugene Nov 10 '18 at 00:34
  • 2
    4 times according to my own maths ;) I get your point, I was thinking about tricky use cases with flatMap, followed by filter. In such hypothetical cases, maybe reality is counterintuituve. Let me try to find an example... – fps Nov 10 '18 at 00:43
  • And after some digging and some research, I can confidently say that while in some older versions of Java (8&9) `flatMap` wasn't completely lazy, I couldn't find an example that process elements of the stream more than once. – fps Nov 10 '18 at 15:19
3

From the documentation:

Laziness-seeking. Many stream operations, such as filtering, mapping, or duplicate removal, can be implemented lazily, exposing opportunities for optimization. For example, "find the first String with three consecutive vowels" need not examine all the input strings. Stream operations are divided into intermediate (Stream-producing) operations and terminal (value- or side-effect-producing) operations. Intermediate operations are always lazy.

By that virtue, because filter is an intermediate operation which creates a new Stream as part of its operation, due to its laziness, it will only ever invoke the filter predicate once per element as part of its rebuilding of the stream.

The only way that your method would possibly have a different number of invocations against it in the stream is if the stream were somehow mutated between states, which given the fact that nothing in a stream actually runs until the terminal operation, would only realistically be possible due to a bug upstream.

Makoto
  • 104,088
  • 27
  • 192
  • 230
  • So what does "Laziness" mean in Java streams? To me it means that the actual invocation of intermediate operation will happen after invocation of terminal operation. But it does not guarantee that intermediate operation will be invoked only once per element (or not at all depending on terminal operation). – tsolakp Nov 10 '18 at 00:32