6

I have been trying to find clear contract from official Java documentation as to in which order Java streams, once a terminal operation is called, process the elements and call intermediate operations.

For example lets look at these examples that use both Java stream version and plain iteration version (both producing same outcome) .

Example1:

    List<Integer> ints = Arrays.asList(1, 2, 3, 4, 5);
    Function<Integer, Integer> map1 = i -> i;
    Predicate<Integer> f1 = i -> i > 2;

    public int findFirstUsingStreams(List<Integer> ints){
        return ints.stream().map(map1).filter(f1).findFirst().orElse(-1);
    }

    public int findFirstUsingLoopV1(List<Integer> ints){
        for (int i : ints){
            int mappedI = map1.apply(i);
            if ( f1.test(mappedI) ) return mappedI;
        }
        return -1;
    }

    public int findFirstUsingLoopV2(List<Integer> ints){
        List<Integer> mappedInts = new ArrayList<>( ints.size() );

        for (int i : ints){
            int mappedI = map1.apply(i);
            mappedInts.add(mappedI);
        }

        for (int mappedI : mappedInts){
            if ( f1.test(mappedI) ) return mappedI;
        }
        return -1;
    }

Will the Java stream in findFirstUsingStreams method after findFirst is called run map1 in the order described in findFirstUsingLoopV1 (map is not run for all elements) or as described in findFirstUsingLoopV2 (map is run for all elements)?

And will that order change in future versions of Java or there is an official documentation that guarantees us the order of map1 calls?

Example2:

Predicate<Integer> f1 = i -> i > 2;
Predicate<Integer> f2 = i -> i > 3;


public List<Integer> collectUsingStreams(List<Integer> ints){
    return ints.stream().filter(f1).filter(f2).collect( Collectors.toList() );
}

public List<Integer> collectUsingLoopV1(List<Integer> ints){
    List<Integer> result = new ArrayList<>();
    for (int i : ints){
        if ( f1.test(i) && f2.test(i) ) result.add(i);
    }
    return result;
}

public List<Integer> collectUsingLoopV2(List<Integer> ints){
    List<Integer> result = new ArrayList<>();
    for (int i : ints){
        if ( f2.test(i) && f1.test(i) ) result.add(i);
    }
    return result;
}

Again will the Java stream in collectUsingStreams method after collect is called run f1 and f2 in the order described in collectUsingLoopV1 (f1 is evaluated before f2) or as described in collectUsingLoopV2 (f2 is evaluated before f1)?

And will that order change in future versions of Java or there is an official documentation that guarantees us the order of f1 and f2 calls?

Edit

Thanks for all the answers and comment but unfortunately I still dont see good explanation on the ordering of processing the elements. The docs do say that the encounter order will be preserved for lists but they dont specify how those elements will be processed. For example in case of findFirst the docs guarantees that map1 will first see 1 then 2 but it does not say that map1 wont be executed for 4 and 5. Does it mean that we cant guarantee that our order of processing will be as we expect the in fure versions of Java? Probably yes.

tsolakp
  • 5,858
  • 1
  • 22
  • 28
  • 2
    There is no guarantee of the order, and provided the result isn't changed the order could be anything. Functional programming assumes no side effects so if you have a side effect which depends on the order of processing, this breaks this basic assumption. – Peter Lawrey Dec 22 '17 at 17:35
  • @Peter Lawrey. I agree that is a safe assumption but I have seen answers on SO that assume that there is an order. For example this answer does and author defends that the order will be maintained for `findFirst`: https://stackoverflow.com/questions/43280207/return-first-non-null-value. – tsolakp Dec 22 '17 at 17:50
  • Check this: https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html#Ordering. If underlying source is ordered then in case of sequential stream you could assume that elements will be processed in the same order unless you use `sorted()` – Ivan Dec 22 '17 at 18:12
  • 2
    You are asking about two orders: processing ( which is nt defined to be preserved - and parallel to your streams and see ) and encouter order which is defined to be preserved if the source of the stream is ordered and there are no intermediate operations that break it, be it parallel or sequential. – Eugene Dec 22 '17 at 18:23
  • @Eugene. Are you saying that we cannot be sure that in future versions of Java the V2 versions of above methods won't happen? – tsolakp Dec 22 '17 at 21:47
  • 1
    I would say Example1 is deliberately undefined, and in practice depends on whether there are [stateful operations](https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html#StreamOps) in the pipeline. For Example2 I would say the order must follow the pipeline order. I'm not sure if or where it's documented, but I think it's a central premise to the use of streams that operations are executed in order. – shmosel Dec 22 '17 at 22:00
  • 1
    @shmosel indeed, the second example only works, because both filters are independent, which the implementation can’t assume. How many stream pipelines start with filter(Objects::nonNull)? It would break the entire operation if that check was allowed to be postponed (swapped with subsequent operations). – Holger Jan 23 '18 at 12:04

1 Answers1

12

And will that order change in future versions of Java or there is an official documentation that guarantees us the order of map1 calls?

The javadocs, including package summaries (people often overlook those somehow), are the API contract. Behavior that is observable but not defined by the javadocs generally should be considered an implementation detail which may change in future versions.

So if it cannot be found in the javadocs then there is no guarantee.

In which order stream pipeline stages are called and interleaved is not specified. What is specified is under which circumstances the so-called encounter order of streams is preserved. Assuming an ordered stream an implementation is still allowed to perform any interleaving, batching and internal reordering that would preserve the encounter order. E.g. a sorted(comparator).filter(predicate).findFirst() could be internally replaced with a filter(predicate).min(comparator) which of course significantly affects the way in which both the Predicate and Comparator are invoked and yet yield the same results, even in an ordered stream.

Does it mean that we cant guarantee that our order of processing will be as we expect the in fure versions of Java? Probably yes.

Yes, and that should not be an issue since most of the stream APIs require callbacks to be stateless and free of side-effects, which among other things means they should not care about the internal execution order of the stream pipeline and the results should be identical, modulo the leeway granted by unordered streams.

The explicit requirements and absence of guarantees give the JDK developers flexibility how the streams are implemented.

If you have any particular case in mind where this would matter you should ask a more concrete question, about an execution reordering that you'd like to avoid.


You should always keep in mind that streams could be parallel, e.g. instances passed by 3rd-party code, or contain a source or intermediate stream operation that is less lazy than it could theoretically be (currently flatMap is such an operation). Stream pipelines can also contain custom behavior if someone extracts and rewraps a spliterator or uses a custom implementation of the Stream interface.

So while particular stream implementations might exhibit some predictable behavior when you use them in a specific way and future optimizations for that specific case could be considered highly unlikely this does not generalize to all possible stream pipelines and therefore the APIs can't provide such general guarantees.

the8472
  • 40,999
  • 5
  • 70
  • 122
  • 2
    that is one interesting example with sorted! Plus one. – Eugene Dec 22 '17 at 19:50
  • @Eugene indeed, interesting example. But the question’s second example won’t work. – Holger Jan 23 '18 at 12:09
  • @Holger in the sense that `f1` and `f2` could be inverted? – Eugene Jan 23 '18 at 12:19
  • @Eugene consider `.filter(Objects::nonNull) .filter(e -> e.someProperty() == x)`… – Holger Jan 23 '18 at 12:20
  • @Holger well yes :) point understood. But in that case I see no reason why `f1` and `f2` could not be replaced (even if that would make one absolute probably), I just have no idea how a jvm is supposed to reason about this in the first place, thinking this would be impossible. – Eugene Jan 23 '18 at 12:22
  • @Holger same thing would *somehow* be my point against `filter(x->true)` or worse `filter(x -> if(x==x) return true; return false)`. I see this is pretty darn hard to "understand" for the Stream API to actually be replaced or handled in a smarter way – Eugene Jan 23 '18 at 12:26
  • @Eugene I don’t know what you mean with “why `f1` and `f2` could not be replaced”. Whatever the JRE does, it has to maintain the semantic, i.e. not to throw an NPE in my example. When maintaining the semantic, it can do whatever it wants. Short-cutting the entire operation when there is a `filter(x -> false)`, staying *sized* when there only is a `filter(x -> true)`, etc. Not with the current JVM, though, it would be even possible to implement it atop the current reflective capabilities, but cause an unjustified overhead as `x -> true` and `x -> false` rarely happen in real life… – Holger Jan 23 '18 at 12:40
  • @Holger I really meant inverted there, not replaced; or I guess replaced with each other... – Eugene Jan 23 '18 at 12:41
  • @Eugene with sufficient inlining f1 could be completely elided because the f2 check narrows the range further. But one could only see that in the generated assembly, not by adding print statements or java-level debugging. – the8472 Jan 24 '18 at 19:36
  • @the8472 that was my point entirely with *even if that would make one absolute probably*. The question I had was a bit different, don't know if I made myself clear though – Eugene Jan 24 '18 at 19:38