Why does Stream.reduce take BinaryOperator rather than BiFunction?

Question

For my specific case I want to use functional composition in a reduction; for example:

BiFunction<ImmutableSet<Integer>, ImmutableSet<Integer>, Sets.SetView<Integer>> f = Sets::intersection;
Function<Sets.SetView<Integer>, ImmutableSet<Integer>> g = Sets.SetView::immutableCopy;
BiFunction<ImmutableSet<Integer>, ImmutableSet<Integer>, ImmutableSet<Integer>> biFunction = f.andThen(g);
ImmutableSet<Integer> intersection = Stream.of(ImmutableSet.of(1, 2, 3), ImmutableSet.of(1, 2), ImmutableSet.of(4))
    .reduce(biFunction)
    .orElse(ImmutableSet.of());

This has a compilation error:

argument mismatch BiFunction cannot be converted to BinaryOperator

Instead, I need to do:

ImmutableSet<Integer> intersection = Stream.of(ImmutableSet.of(1, 2, 3), ImmutableSet.of(1, 2), ImmutableSet.of(4))
    .reduce((a, b) -> Sets.intersection(a, b).immutableCopy())
    .orElse(ImmutableSet.of());

However, this loses the point-free style that composition provides.

Why is the Stream API is designed like this? A BinaryOperator is a BiFunction, so wouldn't it make more sense to declare the reduce method's parameter with the supertype?

I suspect the only reason is that `BinaryOperator` is easier to read in the docs; I can't think of any other benefits, given that `BinaryOperator` is a subtype of `BiFunction`. It's not impossible that the API designers just didn't consider it. — kaya3, Feb 17 '20 at 23:45
Besides the fact that your “point-free style” variant is twice the amount of code, it even has more dots in it (and I’m not counting the dots in `::` operators). Or did you rather mean “pointless style”? Why is creating an expensive copy of every intermediate result even a goal? What about `reduce(Sets::intersection).map(ImmutableSet::copyOf).orElse(ImmutableSet.of())`, performing the expensive operation only once at the end? The API design has been discussed in [this Q&A](https://stackoverflow.com/q/35680706/2711488), so if your question is just *why*, it’s a duplicate of it. — Holger, Feb 18 '20 at 09:52
"Point-free" means combining functions without naming variables just to pass the output of one as the input to another; it does not mean "with no dots in the code". In Haskell, for example, you have to use dots to do it. — kaya3, Feb 18 '20 at 10:20
@kaya3 then it’s a quiet misleadingly named concept. How is anyone reading the term supposed to understand that “point” means “unnecessary temporary parameter variable”? — Holger, Feb 18 '20 at 12:17
@Holger They could understand that by having learned it, e.g. by reading a definition from a textbook or [Wikipedia](https://en.wikipedia.org/wiki/Tacit_programming). I don't think it is more misleading than other technical terms. As I understand it, "point" in this context means function argument, in the same sense as a "fixed point" of a function. — kaya3, Feb 18 '20 at 12:23
@kaya3 and why should anyone be required to learn the misleading term, when the thing already has a less misleading name, i.e. “Tacit Programming”? — Holger, Feb 18 '20 at 12:36
@Holger sorry, I called it point-free because thats how its usually referred to in Haskell. Usually it results in less code. Unfortunately in Java, you need to use casts to use methods like `Function.compose` on method references so it results in more code. Maybe it will be fixed in a future version of Java. — wilmol, Feb 18 '20 at 23:21
Never mind. Java is not an FP language and the limitations make different constructs preferable to a pragmatic programmer. It’s not only that Java doesn’t have an operator but a method, to invoke it, you need a receiver type for the method, which can not be inferred. Needing to write explicit types (with the verbose generic syntax) is what makes it so bloated then. You could fix it by writing your own `compose`/`andThen` method as a *static* method, then `reduce(myCompose(method::ref, method::ref))` would actually work. But `myCompose` itself would use again a lambda expression internally… — Holger, Feb 19 '20 at 08:29
The linked question is not a duplicate; it's about invariance of the parameter type for the BinaryOperator, i.e. the case BinaryOperator where U extends T. This question is about the BinaryOperator type vs. BiFunction. — kaya3, Feb 19 '20 at 08:38
@kaya3 the label doesn’t say that it is a duplicate but “This question already has answers here”. When you read the last part of the accepted answer, you’ll see that it is correct. — Holger, Feb 19 '20 at 08:43
@Holger Ah, I wasn't aware that that text didn't mean "duplicate". Thanks. — kaya3, Feb 19 '20 at 08:48

score 5 · Answer 1 · answered Feb 18 '20 at 00:57

5

The reduce operation must take arguments of the same type and return an identical type. If it didn't, there'd be a type mismatch. That's exactly what the BinaryOperator is: BinaryOperator<T> extends BiFunction<T,T,T>

Instead of using a lambda, you can create your BiFunction. Then create a BinaryOperator:

import java.util.function.BinaryOperator;
import java.util.function.BiFunction;
import java.util.function.Function;

import java.util.stream.Stream;

import com.google.common.collect.ImmutableSet;
import com.google.common.collect.Sets;

public class StackOverflowTest {
  public static void main(String[] args) {

    BiFunction<ImmutableSet<Integer>, ImmutableSet<Integer>, Sets.SetView<Integer>> f = Sets::intersection;
    Function<Sets.SetView<Integer>, ImmutableSet<Integer>> g = Sets.SetView::immutableCopy;

    BiFunction<ImmutableSet<Integer>, ImmutableSet<Integer>, ImmutableSet<Integer>> biFunction = f.andThen(g);

    BinaryOperator<ImmutableSet<Integer>> biOperator = biFunction::apply;

    ImmutableSet<Integer> intersection =
       Stream.of(ImmutableSet.of(1, 2, 3),
                 ImmutableSet.of(1, 2),
                 ImmutableSet.of(1, 4)) // added a 1
             .reduce(biOperator)
             .orElse(ImmutableSet.of());

    System.out.println(intersection);

/*
prints:
[1]
*/
  }
}

answered Feb 18 '20 at 00:57

Scratte

3,056
6
19
26

7

But this doesn't answer the question of why the API *demands* `BinaryOperator` when the more general `BiFunction` would work just fine, and the caller may provide a `BiFunction` via the convenience class `BinaryOperator` if they so choose, rather than being forced to use a subtype. This ignores the Liskov substitution principle. – Bohemian Feb 18 '20 at 01:58
1

@Bohemian No, you're right. It doesn't answer why Oracle has chosen this. I'd expect that answer would have to come from the library developers at Oracle. I'd rather not speculate. However, I did sense OP was disgruntled about using a lamdba instead of a function of some kind. I chose to provide with an answer to give at least that. – Scratte Feb 18 '20 at 02:07
6

@Bohemian it has been answered in [this old answer](https://stackoverflow.com/a/35685511/2711488). Well, the reason doesn’t need to be convincing. But at least, we know what the devs said about it… – Holger Feb 18 '20 at 10:06
@Holger I still remember a Russian presentation that I attended when there used to be `Block` in the Stream packages... – Eugene Feb 18 '20 at 14:32
2

@Eugene `Block` was the predecessor of `Consumer`. – Holger Feb 18 '20 at 14:52

kaya3 · Answer 2 · 2020-02-19T11:20:04.700

2

I don't think there is likely to be a satisfying answer to your actual question; the only benefit I can think of for taking the stricter type BinaryOperator<T> as an argument is readability, but that may or may not have been on the minds of the API designers. We will probably never know what the rationale for the decision was unless one of the people who made the decision writes an answer.

That said, in your particular case it seems like the Sets.SetView::immutableCopy function is unnecessary inside the reduction, because the Sets::intersection function doesn't require its arguments to be of type ImmutableSet<E>, so long as we specify that the stream values are the weaker type Set<E>. So the following should be logically equivalent:

ImmutableSet<Integer> intersection = 
  Stream.<Set<Integer>>of(/* the sets */)
        .reduce(Sets::intersection)
        .map(ImmutableSet::copyOf)
        .orElse(ImmutableSet.of());

There may be performance differences due to the fact that Sets::intersection returns a view, without doing any copying. If the intersections are likely to be large relative to the original sets, and the number of sets is not large, then this version should be more efficient due to doing less memory allocation and copying. Otherwise if the intersections are likely to be small, or the number of sets is large, then the copying could be beneficial since it's faster to iterate over a smaller set than a view of the intersection of two large sets.

That said, in the second case I would recommend writing this in the imperative style with a for loop, so you can stop early if the accumulator is already empty.

edited Feb 19 '20 at 11:20

answered Feb 18 '20 at 13:43

kaya3

47,440
4
68
97

Great idea. I just tried this using `.reduce(Sets::intersection)`. However it seems a cast is required, as it's giving the error `SetView cannot be converted to ImmutableSet` which would mean to indicate that the cast needs to happen inside the reduce. However I didn't research if the library has a method to give a `Set.SetView` from an `ImmutableSet`, but it doesn't appear so at first glance. It does work with `.map(e -> Sets.union(e,e))` prior to the reduce, but I think of it as a sort of a hack. – Scratte Feb 18 '20 at 15:14
@Scratte Thanks - I guess the cast required is in `Stream.>of(...)` so that the stream elements are `Set` instead of `ImmutableSet`, and then `Sets::intersection` can be inferred as type `BinaryOperator>`. Does that work? – kaya3 Feb 18 '20 at 15:18
1

Yes, it works :) I just added it as an edit. – Scratte Feb 18 '20 at 15:25
1

Thanks a lot for your help. I've removed the sentence about casting since the code is no longer untested. – kaya3 Feb 18 '20 at 15:26
1

I just noticed that I forgot to remove the extra ``. The `.reduce(Sets::intersection)` can be changed to `.reduce(Sets::intersection)`, as you pointed out. Sorry about the inconvenience. I already tested it. – Scratte Feb 19 '20 at 10:22
@Scratte I suspected it could be; thanks for testing. I guess it didn't cause any inconvenience, though! – kaya3 Feb 19 '20 at 11:21

Naman · Answer 3 · 2020-02-18T10:16:46.937

0

This is a lot simpler and relatable with the existing overload

Optional<T> reduce(BinaryOperator<T> accumulator)

used in the question and can be used as in:

ImmutableSet<Integer> intersection = Stream.of(ImmutableSet.of(1, 2, 3), ImmutableSet.of(1, 2), ImmutableSet.of(4))
        .reduce(biFunction::apply)
        .orElse(ImmutableSet.of());

edited Feb 18 '20 at 10:16

answered Feb 18 '20 at 04:44

Naman

27,789
26
218
353

Of course, the answer does not state the reason for choosing a specific signature to be exposed via the JDK implementation, but it does point out that for one or the other reason it could have been not much worth to provide such an inbuilt API. – Naman Feb 18 '20 at 04:57
@kaya3 Indeed the reduction with identity wouldn't work. Updated the answer to not include the overloaded version. Thanks for pointing out. – Naman Feb 18 '20 at 13:17

score 0 · Answer 4 · answered Feb 18 '20 at 08:23

0

This may be to clearly state the intent that the reduction should not be cumulative (see Reduction operations), and must be an operation not a function (too general). As an operation is usually thought as operating by value?

Naman's answer is also a good hint on that as reduce(biFunction::apply) can be very easily applied for.

answered Feb 18 '20 at 08:23

Jean-Baptiste Yunès

34,548
4
48
69

`BinaryOperator` is a subtype of `BiFunction`, so there is no such thing as a binary operator which is not a function. In this sense, "function" is the [mathematical term](https://en.wikipedia.org/wiki/Function_(mathematics)) for a mapping from input values to output values. – kaya3 Feb 18 '20 at 10:24
@kaya I wasn't talking about subtyping (I know the relationship). I was more on semantic overload of the names. operator seem to be more clearly related to what was intented, function is more ambiguous (two meanings : CS and Math). – Jean-Baptiste Yunès Feb 18 '20 at 10:50

Why does Stream.reduce take BinaryOperator rather than BiFunction?

4 Answers4