2

A solution that I came up on another Stackoverflow question that is using Stream.peek operation works but still seems like is not right because it mutates state in the Stream.peek method.

While researching (here and here) on Stream.peek usage whether it is ok to mutate state I am still not fully convinced that Stream.peek should not mutate state (including state of collection that is source of the Stream).

Here is what Javadoc says:

This method exists mainly to support debugging, where you want to see the elements as they flow past a certain point in a pipeline:

And then:

Parameters: action - a non-interfering action to perform on the elements as they are consumed from the stream.

For well-behaved non-interfering stream sources, the source can be modified before the terminal operation commences and those modifications will be reflected in the covered elements.All the streams returned from JDK collections, and most other JDK classes, are well-behaved in this manner.

Seems like non-interfering action does includes changing the state of collection in the stream.

Here is the code that uses Stream.peek.

Map< String, List<Test> > userTests = new HashMap<>();

Map< String, List<Test> > filtered  = userTests.entrySet().stream()
        .peek( e -> e.setValue( modifyListAndReturnIt( e.getValue() ) ) )
        .filter( e -> !e.getValue().isEmpty() ) //check if modified list in peek has been emptied
        .collect( Collectors.toMap(p -> p.getKey(), p -> p.getValue() ) );

    public static List<Test> modifyListAndReturnIt(List<Test> list){
        if (somecondition) list.clear();
        return list;
    }

1) Can the above code have any side effect?

2) Why not use peek in such a way. The Javadoc does not seem to not allow it?

tsolakp
  • 5,858
  • 1
  • 22
  • 28
  • can you show the definition of the Streams source for completeness and the definition of the method being used within the Stream pipeline?. Also, as for your second question you’ll only need to ask your self _”why would the java doc state that the method is for debugging purposes only”_? Then ask your self if the implementation of the “peek” method changes, what will happen to my code? – Ousmane D. Nov 17 '17 at 18:52
  • 1
    My advice is that, you should always use a method the way it was intended to be used. Nothing more. – Ousmane D. Nov 17 '17 at 19:02
  • @Aominè. Actually the Javadoc does not state that is should "only" be used for debugging, It says "mostly" which is not same as "only". – tsolakp Nov 17 '17 at 19:38
  • You may want to read the rest of the java.util.stream documentation to which you’ve linked, particularly the [side effects section](https://docs.oracle.com/javase/9/docs/api/java/util/stream/package-summary.html#SideEffects). For instance, there are no guarantees “that different operations on the "same" element within the same stream pipeline are executed in the same thread.” So even if your code happens to work today, it may not work in a future Java release, or on a machine with different concurrency capabilities (like more CPU cores). – VGR Nov 17 '17 at 20:41
  • @VGR. The doc does not really say what they consider a side effect and where it applies. But it does mention this: ***A small number of stream operations, such as forEach() and peek(), can operate only via side-effects; these should be used with care.*** – tsolakp Nov 17 '17 at 20:47
  • In the absence of an explicit definition, I think you can assume any mutation operation is a side effect. Even if it’s thread-safe, you have no guarantee of its order of execution. – VGR Nov 17 '17 at 21:18

1 Answers1

3

What you seem to do looks harmless as Brian Goetz states in comment here.

Now the problem with peek is that if you do side effects inside it - you would expect these side effects to actually happen. So, suppose you would want to alter some property of some object like this:

myUserList.stream()
          .peek(u -> u.upperCaseName())
          .count()

In java-8 your peek would be indeed called, in 9 - it is not - there is no need to call peek here since the size can be computed without it anyway.

While being on the same path, imagine that your terminal operation is a short-circuit one, like findFirst or findAny - you are not going to process all elements of the source - you might get just a few of them through the pipeline.

Things might get even stranger if your peek would rely on a encounter order even if your terminal operation would not be a short-circuit one. The intermediate operations for parallel processing do not have an encounter order, while the terminal ones - do. Imagine the surprises you might be facing.

Eugene
  • 117,005
  • 15
  • 201
  • 306
  • now that I've written this and actually saw that you linked your question to this: https://stackoverflow.com/questions/33635717/in-java-streams-is-peek-really-only-for-debugging; it looks like and exact duplicate to me - even my answer looks like a summarization of that :| I will close it as a duplicate if there are more votes towards that – Eugene Nov 17 '17 at 22:12
  • Unfortunately there does not seem a clear answer to this. Some say dont use `peek` since it is not intended for this type of usage but some say it can be used but with great care. Seems like no general thumb of the rule can be defined about `peek` or even any other intermediate operation like `map` that can potentially change state too. – tsolakp Nov 17 '17 at 22:58
  • @tsolakp you're right in some way - I agree. In these cases I stick to the official documentation and what it says and it clearly states that side-effects are bad. Of course people might try to abuse this in some ways, it's entirely up to you if you want to be on that side – Eugene Nov 18 '17 at 12:47
  • @tsolakp the same thing is true for example when a `reduce` violates the `identity` for example - compiler can't "force" you to do the right thing... – Eugene Nov 18 '17 at 12:48