11

I have a List of A, B, C.

C reduce A reduce B != A reduce B reduce C (however, A reduce (B reduce C) is OK).

In other words, my reduction operation is associative but not commutative.

Does java enforce on an ordered sequential stream (such as the default one from a list) that reduction will always happen according to the encounter order? That is to say, will java reorder reductions (such that B reduce A instead of A reduce B)?

(Hopefully this is clear enough).

edit to add a little demo and maybe helping to clarify

Stream.of(" cats ", " eat ", " bats ")
  .reduce("", (a, b) -> a + b); // cats eat bats

With the above, could the output ever be "bats cats eat" or "eat bats cats"? Is that guaranteed somewhere in the spec?

Andremoniy
  • 34,031
  • 20
  • 135
  • 241
Cogman
  • 2,070
  • 19
  • 36
  • For such a stupid person as my self: what is **reduction** operation?? – Andremoniy Mar 27 '18 at 18:43
  • 1
    @Andremoniy https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html#Reduction – shmosel Mar 27 '18 at 18:44
  • 1
    @shmosel okay, but this is a reduction of the list, not of the particular list elements!! – Andremoniy Mar 27 '18 at 18:46
  • @Andremoniy I assume `reduce` here is a reference to the reduction operation. – shmosel Mar 27 '18 at 18:46
  • @shmosel I still don't understand: what does it mean `C reduce A reduce B`? – Andremoniy Mar 27 '18 at 18:47
  • 1
    @Andremoniy I think it means `op.apply(op.apply(a, b), c)`, where `op` is a `BinaryOperator`. – shmosel Mar 27 '18 at 18:49
  • To answer the question, I can't find it in the documentation, but I'm sure `reduce()` respects the encounter order, if there is one. – shmosel Mar 27 '18 at 18:51
  • 1
    @DD You've managed to make things a whole lot more confusing. – shmosel Mar 27 '18 at 19:03
  • @shmosel Can this not be because `BinaryOperator` is associative? – SSC Mar 27 '18 at 19:10
  • @Cogman you should give an example with code of what you mean – Jose Da Silva Gomes Mar 27 '18 at 19:14
  • @shmosel I think now I get the meaning of the question and provided an alternative answer ;) – Andremoniy Mar 27 '18 at 19:18
  • @BrianGoetz What do you mean? Doesn't associative mean `("a" + "b") + "c"` equals `"a" + ("b" + "c")`? – shmosel Mar 27 '18 at 21:58
  • @BrianGoetz are you referring to my addition of `+ " " +` or is there some deeper thing with string concatenation that I don't understand? (removed it just in case) – Cogman Mar 27 '18 at 22:54
  • I think that would still be associative. FYI, you can replace `(a, b) -> a + b` with `String::concat`. – shmosel Mar 27 '18 at 23:06
  • 1
    @shmosel `reduce("", (a, b) -> a + " " + b)` would violate the contract regarding the identity value, as `f(id, x) == x` doesn’t hold. As a consequence you truly get unwanted spurious spaces when using it with a parallel stream. Using `String::concat` is fine, which shouldn’t come at a surprise as `.reduce("", String::concat)` can be found as example of a valid (but inefficient) reduction [right in the documentation](https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html#MutableReduction). – Holger Mar 28 '18 at 12:31
  • @Holger I see. But that problem is specific to the overload that accepts an identity, right? – shmosel Mar 28 '18 at 15:55
  • @shmosel as far as I can see, yes. Without an identity, it’s the same logic as `Collectors.joining(" ")` (like `reduce((a, b) -> a + " " + b).orElse("")`). – Holger Mar 28 '18 at 16:20

4 Answers4

10

According to the specification it respects the order of the elements.

A proof is very simple. The specification claims that a reduction function has to be associative.

However, associativity it self doesn't make any sense if the order is not preserved. According to the mathematical definition of the associative property:

Within an expression containing two or more occurrences in a row of the same associative operator, the order in which the operations are performed does not matter as long as the sequence of the operands is not changed.

In other words, associative property doesn't imply that:

(a + b) + c = (a + c) + b

It only allows an arbitrary permutation of the order in which operations are applied.

Andremoniy
  • 34,031
  • 20
  • 135
  • 241
8

You have asked two questions in one.

  1. Does java enforce on an ordered sequential stream (such as the default one from a list) that reduction will always happen according to the encounter order?

Assuming “will always happen” is referring to the order of the function evaluation, the answer is no, this is not guaranteed.

  1. Stream.of(" cats ", " eat ", " bats ")
      .reduce("", (a, b) -> a + b); // cats eat bats
    
    With the above, could the output ever be "bats cats eat" or "eat bats cats"? Is that guaranteed somewhere in the spec?

Regardless of the evaluation order of the reduction function (the processing order), the result is guaranteed to be " cats eat bats ", correctly reflecting the encounter order (see also this answer). To ensure that the unspecified processing order still yields the correct result regarding the encounter order, the reduction function must be associative, as specified

Note that the documentation even shows .reduce("", String::concat) as an example of a valid, but inefficient reduction function. Similarly, (a,b) -> b has been acknowledged as a valid way to get the last element of a stream.

The key point is given in the “Associativity” section of the documentation:

Associativity

An operator or function op is associative if the following holds:

(a op b) op c == a op (b op c)

The importance of this to parallel evaluation can be seen if we expand this to four terms:

a op b op c op d == (a op b) op (c op d)

So we can evaluate (a op b) in parallel with (c op d), and then invoke op on the results.

Examples of associative operations include numeric addition, min, and max, and string concatenation.

Holger
  • 285,553
  • 42
  • 434
  • 765
3

When you use Stream.of() the doc says:

Returns a sequential ordered stream whose elements are the specified values.

So at this point, you know you have a ordered sequential stream, and the javadoc of stream ops also says:

For sequential streams, the presence or absence of an encounter order does not affect performance, only determinism. If a stream is ordered, repeated execution of identical stream pipelines on an identical source will produce an identical result; if it is not ordered, repeated execution might produce different results.

Regarding only to the reduce operation, the result should be identical when the order exists for sequential streams, and even for parallel ordered streams the operation will keep the final order (at least in the current implementations of java8 and java9, in the future some optimizations can occur, but the order of ordered streams using reduce will probably never change).

You have to be carefull of knowing when the stream is ordered. For instance, operations like map or filter preserves the order of the stream, so if you have an ordered stream, you can use this methods and the stream will continue to be ordered.

note: ordered is totally different than sorted.

If a stream is ordered, most operations are constrained to operate on the elements in their encounter order; if the source of a stream is a List containing [1, 2, 3], then the result of executing map(x -> x*2) must be [2, 4, 6]

Edit (according to the comment):

But is not constrained to execute sequentially.

This is why the associativity is necessary, for instance, if you have a stream generated from an array like this { a, b, c, d}, then a + b could be resolved first, then c + d and finally all together (a + b) + (c + d), that's why the operation must be associative. This way if the operation is indeed associative (like it has to be) the final order will be preserved.

Community
  • 1
  • 1
Jose Da Silva Gomes
  • 3,814
  • 3
  • 24
  • 34
  • What concerns me is this blurb in the reduce docs "This is equivalent to ... but is not constrained to execute sequentially." https://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html#reduce-java.util.function.BinaryOperator- – Cogman Mar 27 '18 at 20:16
  • Well that makes reference to the parallel stream, which uses the same interface. For instance if you have the strings `a`, `b`, `c` and `d` in a parallel stream `a` + `b` could be resolved first, then `c` + `d` and finally all together `a` + `b` + `c` + `d`, that's by the operation must be associative. – Jose Da Silva Gomes Mar 27 '18 at 20:22
  • It's important to emphasize that identical ordered streams are only guaranteed to produce identical results when they are sequential. And even that assertion is [questionable](https://stackoverflow.com/q/34247318/1553851). – shmosel Mar 27 '18 at 21:45
  • This does provide an answer for OP's specific scenario, but I'm pretty sure `reduce()` is guaranteed to preserve ordering for parallel streams as well. – shmosel Mar 27 '18 at 21:56
  • @shmosel i didn't find either that the order is guaranteed when using reduce, but in the current implementations this is truth (*added a note saying this*), i think that's highly improbably anyway, to say the least. – Jose Da Silva Gomes Mar 28 '18 at 05:13
3

What concerns me is this blurb in the reduce docs "This is equivalent to ... but is not constrained to execute sequentially."

Execution order is not the same as encounter order. The stream pipeline is allowed to perform "cats" + ("eat" + "bats") or ("cats" + "eat") + "bats"

the8472
  • 40,999
  • 5
  • 70
  • 122