I have a teammate who did all his stream operations inside a single forEach call. I tried to come up with advantages of using the filter, map and toList methods instead.

What advantages do they have besides readability (arguably the forEach version is pretty readable as well)? Here is a snippet close to what he did:

List<Integer> firstInts = new ArrayList<>();
List<Integer> secondInts = new ArrayList<>();
List<Integer> numbersList = IntStream.range(0, max).boxed().toList();


//his stream
numbersList.stream()
       .forEach(i -> {
           if(i % 6 != 0) {
               return;
           }
           secondInts.add(i);
       });


//alternative 1
numbersList.stream()
        .filter(i -> i % 6 == 0)
        .forEach(firstInts::add);


//alternative 2
List<Integer> third = numbersList.stream()
        .filter(i -> i % 6 == 0)
        .toList();

What is the motivation for using the stream methods, other than readability?

Lior Derei
  • Having side effects in stream methods, e.g. mutating the secondInts state, is frowned upon / not a good idea. It works here because you have a terminal operation and no parallelism, but it is still not a good idea. – luk2302 Apr 16 '23 at 08:26
  • The whole idea behind forEach is performing side effects (it's not part of pure functional programming theory). Side effects should be avoided in general, but I can't find a strong argument for using one rather than the other in this example. – Lior Derei Apr 16 '23 at 08:29
  • The major advantage is the possible parallelism. Everything else is just opinion. – user207421 Apr 16 '23 at 09:46
  • map/filter are composable and lazy. forEach is eager and cannot be composed. Heavily opinion-based. – knittl Apr 16 '23 at 09:54
  • [Duplicate?](https://stackoverflow.com/questions/76004826/java-stream-filter-performance) – DuncG Apr 16 '23 at 16:22

3 Answers

Alternative 2 looks best to me in this case.

1. Readability

  • Alternative 2 has fewer lines.
  • Alternative 2 reads more like "return a list containing the numbers from numbersList that are divisible by 6", while the forEach approach reads as "add each number from numbersList that is divisible by 6 to secondInts".
  • filter(i -> i % 6 == 0) is straightforward, whereas
    if(i % 6 != 0) {
        return;
    }
    
    takes the human brain some time to process.

2. Performance

From Stream.toList()

Implementation Note: Most instances of Stream will override this method and provide an implementation that is highly optimized compared to the implementation in this interface.

By using the Stream API, we benefit from optimizations made in the JDK.

In this case, using forEach and adding elements one by one is likely to be slower, especially when the list is large. This is because the ArrayList has to grow its capacity whenever it becomes full, while the Stream implementation (ImmutableCollections.listFromTrustedArrayNullsAllowed) just stores the result array in a ListN.

One more point to note about parallelism:
From Stream#forEach

The behavior of this operation is explicitly nondeterministic. For parallel stream pipelines, this operation does not guarantee to respect the encounter order of the stream, as doing so would sacrifice the benefit of parallelism. For any given element, the action may be performed at whatever time and in whatever thread the library chooses. If the action accesses shared state, it is responsible for providing the required synchronization.

numbersList.stream().parallel()
       .forEach(i -> {
           if(i % 6 != 0) {
               return;
           }
           secondInts.add(i);
       });

may produce unexpected results, because several threads add to the unsynchronized ArrayList at the same time, while

List<Integer> third = numbersList.stream().parallel()
      .filter(i -> i % 6 == 0)
      .toList();

is totally fine.
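
As a minimal sketch of the safe variant (the class and method names here are made up for illustration, and Stream.toList() assumes Java 16 or later), the parallel pipeline below never touches shared mutable state and still returns its elements in encounter order:

```java
import java.util.List;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the original question.
class ParallelFilterDemo {

    // Collects the multiples of 6 in [0, max) with a parallel pipeline.
    // No external list is mutated; toList() assembles the result itself.
    static List<Integer> multiplesOfSix(int max) {
        return IntStream.range(0, max)
                .boxed()
                .parallel()
                .filter(i -> i % 6 == 0)
                .toList(); // requires Java 16+
    }

    public static void main(String[] args) {
        // The result keeps the encounter order of the source range,
        // even though the filtering may run on several threads.
        System.out.println(multiplesOfSix(100));
    }
}
```

If a mutable result is needed, collect(Collectors.toCollection(ArrayList::new)) is one option that also avoids sharing a list between threads.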

3. Flexibility

Imagine you want the filtered list to be sorted. With the forEach approach, you could do it like this:

numbersList.stream().sorted()
       .forEach(i -> {
           if(i % 6 != 0) {
               return;
           }
           secondInts.add(i);
       });

This is much slower than

numbersList.stream()
        .filter(i -> i % 6 == 0)
        .sorted()
        .toList();

because the former has to sort the whole numbersList rather than only the filtered elements.
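
To make the difference concrete, here is a rough sketch that counts how many elements reach sorted() in each ordering. The class name, method names and the peek-based counter are assumptions added purely for illustration; the counter is instrumentation, not something to keep in real code.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the original answer.
class SortPlacementDemo {

    // Sort first, filter afterwards: every element is handed to sorted().
    static int elementsSortedWhenSortingFirst(List<Integer> numbers) {
        AtomicInteger reachedSort = new AtomicInteger();
        numbers.stream()
                .peek(i -> reachedSort.incrementAndGet()) // counts input of sorted()
                .sorted()
                .filter(i -> i % 6 == 0)
                .toList();
        return reachedSort.get();
    }

    // Filter first: only the surviving elements are handed to sorted().
    static int elementsSortedWhenFilteringFirst(List<Integer> numbers) {
        AtomicInteger reachedSort = new AtomicInteger();
        numbers.stream()
                .filter(i -> i % 6 == 0)
                .peek(i -> reachedSort.incrementAndGet()) // counts input of sorted()
                .sorted()
                .toList();
        return reachedSort.get();
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.range(0, 60).boxed().toList();
        System.out.println(elementsSortedWhenSortingFirst(numbers));   // 60
        System.out.println(elementsSortedWhenFilteringFirst(numbers)); // 10
    }
}
```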

Or, if you want to limit the result to 10 elements, that is not straightforward with forEach, but with the stream methods it is as simple as adding limit(10).
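
A small sketch of that change, assuming Java 16+ for Stream.toList() (the class and method names are invented for the example):

```java
import java.util.List;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the original answer.
class LimitDemo {

    // Returns at most the first ten multiples of 6 found in the input list.
    static List<Integer> firstTenMultiplesOfSix(List<Integer> numbers) {
        return numbers.stream()
                .filter(i -> i % 6 == 0)
                .limit(10) // the only line added to alternative 2
                .toList();
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.range(0, 1000).boxed().toList();
        // prints [0, 6, 12, 18, 24, 30, 36, 42, 48, 54]
        System.out.println(firstTenMultiplesOfSix(numbers));
    }
}
```

Because limit is short-circuiting, a sequential pipeline like this one stops pulling elements from the source once ten matches have been found.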

4. Less error-prone

The Stream API often returns unmodifiable objects by default.

From Stream.toList()

Implementation Requirements: The implementation in this interface returns a List produced as if by the following: Collections.unmodifiableList(new ArrayList<>(Arrays.asList(this.toArray())))

This means that the returned list is unmodifiable by default. Some advantages of immutability are:

  1. You can safely pass the list around to different methods without worrying that it will be modified.
  2. Immutable lists are thread-safe.
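
The following sketch illustrates the difference in behavior (the class and the rejectsAdd helper are hypothetical names introduced for this example; Stream.toList() assumes Java 16+):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the original answer.
class UnmodifiableResultDemo {

    // Returns true if the list refuses to accept a new element.
    static boolean rejectsAdd(List<Integer> list) {
        try {
            list.add(42);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> viaToList = Stream.of(6, 12, 18).toList();
        List<Integer> viaArrayList = new ArrayList<>(viaToList);

        System.out.println(rejectsAdd(viaToList));    // true
        System.out.println(rejectsAdd(viaArrayList)); // false
    }
}
```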

Read Pros. / Cons. of Immutability vs. Mutability for further discussion.

samabcde

I don’t think it’s an issue of advantages. Each mechanism has a specific purpose.

.forEach() returns void, so it doesn’t have an output. The intent is that the elements it iterates over are not modified; their data is used for some sort of calculation. It’s a terminal point in a pipeline. I find that forEach is used much less than map.

.filter() takes a stream as input and emits a filtered stream as output. It is for filtering.

.map() is like forEach, but it emits a stream of modified objects. It applies the same modification to each element so that the result can be saved, filtered or manipulated further.

.toList() is a handy shortcut to turn a stream into a list. Using forEach(List::add) where toList() will do the work is a terrible idea: you’re preventing Java from handling the collection in bulk.
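
As an illustration of how these roles fit together, here is a small sketch with made-up names and data, assuming Java 16+ for Stream.toList():

```java
import java.util.List;

// Hypothetical demo class, not part of the original answer.
class PipelineRolesDemo {

    // Keeps the words longer than three characters and returns their lengths.
    static List<Integer> lengthsOfLongWords(List<String> words) {
        return words.stream()
                .filter(w -> w.length() > 3) // intermediate: drops some elements
                .map(String::length)         // intermediate: transforms each element
                .toList();                   // terminal: gathers the results
    }

    public static void main(String[] args) {
        List<String> words = List.of("map", "filter", "forEach", "toList");
        System.out.println(lengthsOfLongWords(words)); // [6, 7, 6]
    }
}
```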

John Williams

Streams provide a more readable, functional solution. But that comes at a cost, since the stream API is complex and does a lot under the hood.

If one is going to do everything in a forEach block, I would just skip the stream solution and go for an imperative solution using a for loop.
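
For the filtering in the question, such an imperative version might look like the sketch below (the class and method names are invented for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original answer.
class ImperativeLoopDemo {

    // Plain for-each loop: same result as the forEach-with-if stream,
    // without creating a stream at all.
    static List<Integer> multiplesOfSix(List<Integer> numbers) {
        List<Integer> result = new ArrayList<>();
        for (Integer i : numbers) {
            if (i % 6 == 0) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(multiplesOfSix(List.of(1, 6, 7, 12, 13, 18))); // [6, 12, 18]
    }
}
```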

Otherwise, I would go with alternative 2 for the reasons cited (lends itself to parallelism, easier to read, etc.). I seldom use forEach for anything other than printing, although there are always exceptions.

As for whether it's safe to modify external state in some cases (as in alternative 1), I just presume it won't be safe and, with that in mind, focus on a solution (functional or imperative) that provides the desired result.

WJS