A normal reduction is meant to combine two immutable values, such as int or double, and produce a new one; it is an immutable reduction. In contrast, the collect method is designed to mutate a container in order to accumulate the result it is supposed to produce.
To illustrate the problem, suppose you want to implement Collectors.toList() using a simple reduction like this:
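// Broken on purpose: the identity list is a single shared instance that gets mutated.
List<Integer> numbers = stream.reduce(
    new ArrayList<Integer>(),
    (List<Integer> l, Integer e) -> {
        l.add(e);       // mutates the accumulated value instead of returning a new one
        return l;
    },
    (List<Integer> l1, List<Integer> l2) -> {
        l1.addAll(l2);  // in parallel, l1 and l2 can be the same shared list
        return l1;
    });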
This looks like the equivalent of Collectors.toList(). However, in this case you mutate the List<Integer> that was passed as the identity. reduce expects the identity and the intermediate results to be treated as immutable values, and in a parallel stream the same ArrayList instance is shared by every thread. ArrayList is not thread-safe, so when the accumulator adds to the list or the combiner merges lists concurrently you can get an ArrayIndexOutOfBoundsException, duplicated or missing elements, or other unpredictable failures. To make this thread-safe, the accumulator and the combiner would have to return a new list each time instead of mutating their arguments, which would impair performance.
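As a rough sketch of what "a new list each time" would mean, the reduction below never mutates its inputs. The class name ImmutableListReduce and the helper reduceToList are hypothetical, chosen only for this illustration, and the input is assumed to be the integers 0 to n-1.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

public class ImmutableListReduce {

    // Hypothetical helper: reduces 0..n-1 into a list without mutating shared state.
    static List<Integer> reduceToList(int n) {
        return IntStream.range(0, n)
                .boxed()
                .parallel()
                .reduce(
                        Collections.<Integer>emptyList(),   // immutable identity, safe to share
                        (list, e) -> {
                            // Copy the partial result instead of mutating it.
                            List<Integer> copy = new ArrayList<>(list);
                            copy.add(e);
                            return copy;
                        },
                        (l1, l2) -> {
                            // Merge into a fresh list; neither argument is modified.
                            List<Integer> merged = new ArrayList<>(l1);
                            merged.addAll(l2);
                            return merged;
                        });
    }

    public static void main(String[] args) {
        System.out.println(reduceToList(5)); // prints [0, 1, 2, 3, 4]
    }
}
```

This is safe in parallel, but every accumulation step copies the whole partial list, so the work grows roughly quadratically with the number of elements handled by one thread. That copying is the performance cost mentioned above.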
Collectors.toList() works in a similar fashion, but collect guarantees thread safety while the values are accumulated: each thread fills its own container, and the containers are merged afterwards. From the documentation for the collect method:
Performs a mutable reduction operation on the elements of this stream using a Collector. If the stream is parallel, and the Collector is concurrent, and either the stream is unordered or the collector is unordered, then a concurrent reduction will be performed. When executed in parallel, multiple intermediate results may be instantiated, populated, and merged so as to maintain isolation of mutable data structures. Therefore, even when executed in parallel with non-thread-safe data structures (such as ArrayList), no additional synchronization is needed for a parallel reduction.
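The isolation described in that passage can be seen in the three-argument form of collect, which takes a supplier, an accumulator, and a combiner. The following is a minimal sketch; the class name MutableCollectSketch and the helper collectToList are hypothetical, and the input is assumed to be the integers 0 to n-1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.IntStream;

public class MutableCollectSketch {

    // Hypothetical helper: collects 0..n-1 into a list using mutable reduction.
    static List<Integer> collectToList(int n) {
        List<Integer> result = IntStream.range(0, n)
                .boxed()
                .parallel()
                .collect(
                        ArrayList::new,      // supplier: a fresh container for each partial result
                        ArrayList::add,      // accumulator: mutates only its own container
                        ArrayList::addAll);  // combiner: merges one partial container into another
        return result;
    }

    public static void main(String[] args) {
        System.out.println(collectToList(5)); // prints [0, 1, 2, 3, 4]
    }
}
```

Because the supplier is called again whenever a new partial result is needed, no ArrayList is ever shared between threads, which is why no synchronization is required.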
So to answer your question:
When would you use collect() vs reduce()?
If you have immutable values such as ints, doubles, or Strings, then a normal reduction works just fine. However, if you have to reduce your values into, say, a List (a mutable data structure), then you need to use mutable reduction with the collect method.
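To make the rule of thumb concrete, here is a small sketch that uses reduce for immutable values and collect for a mutable container. The class name ReduceVsCollect and its three helper methods are hypothetical and exist only for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ReduceVsCollect {

    // Immutable reduction: each step produces a new Integer value.
    static int sum(List<Integer> xs) {
        return xs.parallelStream().reduce(0, Integer::sum);
    }

    // Immutable reduction: String is immutable, so concat returns a new String.
    static String concat(List<String> xs) {
        return xs.parallelStream().reduce("", String::concat);
    }

    // Mutable reduction: the elements are accumulated into a List via collect.
    static List<Integer> doubled(List<Integer> xs) {
        return xs.parallelStream().map(x -> x * 2).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3, 4)));      // prints 10
        System.out.println(concat(Arrays.asList("a", "b", "c"))); // prints abc
        System.out.println(doubled(Arrays.asList(1, 2, 3)));      // prints [2, 4, 6]
    }
}
```

All three methods are safe on a parallel stream: the first two because their operands are never modified, the third because collect gives each thread its own container.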