12

Is there a 'best practice' for mutating elements within a Stream? I'm specifically referring to elements within the stream pipeline, not outside of it.

For example, consider the case where I want to get a list of Users, set a default value for a null property and print it to the console.

Assuming the User class:

class User {
    String name;

    static User next(int i) {
        User u = new User();
        if (i % 3 != 0) {
            u.name = "user " + i;
        }
        return u;
    }
}

In java 7 it'd be something along the lines of:

for (int i = 0; i < 7; i++) {
    User user = User.next(i);
    if(user.name == null) {
        user.name = "defaultName";
    }
    System.out.println(user.name);
}

In java 8 it would seem like I'd use .map() and return a reference to the mutated object:

IntStream.range(0, 7)
    .mapToObj(User::next)
    .map(user -> {
        if (user.name == null) {
            user.name = "defaultName";
        }
        return user;
    })
    //other non-terminal operations
    //before a terminal such as .forEach or .collect
    .forEach(it -> System.out.println(it.name));

Is there a better way to achieve this? Perhaps using .filter() to handle the null mutation and then concat the unfiltered stream and the filtered stream? Some clever use of Optional? The goal being the ability to use other non-terminal operations before the terminal .forEach().

In the 'spirit' of streams I'm trying to do this without intermediary collections and simple 'pure' operations that don't depend on side effects outside the pipeline.

Edit: The official Stream java doc states 'A small number of stream operations, such as forEach() and peek(), can operate only via side-effects; these should be used with care.' Given that this would be a non-interfering operation, what specifically makes it dangerous? The examples I've seen reach outside the pipeline, which is clearly sketchy.

Adam Bickford
  • 1,216
  • 12
  • 14

3 Answers3

11

Don't mutate the object, map to the name directly:

IntStream.range(0, 7)
    .mapToObj(User::next)
    .map(user -> user.name)
    .map(name -> name == null ? "defaultName" : name)
    .forEach(System.out::println);
assylias
  • 321,522
  • 82
  • 660
  • 783
4

It sounds like you're looking for peek:

.peek(user -> {
    if (user.name == null) {
        user.name = "defaultName";
    }
})

...though it's not clear that your operation actually requires modifying the stream elements instead of just passing through the field you want:

.map(user -> (user.name == null) ? "defaultName" : user.name)
Louis Wasserman
  • 191,574
  • 25
  • 345
  • 413
  • 8
    from the `peek` javadoc: “_This method exists mainly to support debugging, where you want to see the elements as they flow past a certain point in a pipeline_” – Didier L Nov 17 '15 at 21:16
  • 2
    @DidierL: well, yes, that's because you're not really supposed to modify the elements in a stream pipeline. If you want to, though, this is how. – Louis Wasserman Nov 17 '15 at 21:34
  • So that's the true answer to the question: the best practice is to _not_ do that, as stated by @Tunaki in a comment. Doing it in a `forEach` would be fine though. – Didier L Nov 17 '15 at 21:38
  • @DidierL, there's *mainly*, not *exclusively*. See also Brian Goetz [comment](http://stackoverflow.com/questions/33557251/java-lambda-expression-mapping-and-then-modifying-a-list/33557322#comment54894478_33557322): he does not totally object about such use. – Tagir Valeev Nov 18 '15 at 01:50
  • 1
    You can perform a non-interfering modification, though it is not recommended. The problem with `peek`, [as explained here](http://stackoverflow.com/a/33636377/2711488) is that it doesn’t do anything on its own but gets executed as a byproduct of the actual terminal operation and depends on its semantics. Thus, if you want to perform the `peek` operation on *all elements* but combine it with a terminal operation that is short-circuiting or even predicts the final result without processing elements, you’re in trouble, esp. as it might work in some scenarios but fail in others. – Holger Nov 18 '15 at 08:51
  • @Holger It could still make sense when you're calling it as part of the pipeline transition rather than a hack to modify your objects while streaming. A simple example: `someObjects.stream().map(someObject -> convert(someObject)).peek(result -> result.modify()).collect(toList())`. Sure, you could combine `convert()` and `modify()` into a single `map()` operation, but this is cleaner and generally safe. – shmosel Aug 26 '16 at 00:52
  • @shmosel: I don’t see how `.map(…).peek(…)` is cleaner than `.map(……)`, but you are right in that this usage can’t contradict with the terminal operation’s semantics. – Holger Aug 26 '16 at 08:40
  • @Holger As a rule, I consider single expression lambdas to be cleaner than block lambdas. – shmosel Aug 26 '16 at 08:47
  • 1
    @shmosel: that’s still possible by letting the `convert` method do the conversion, or generally, if a lambda expression gets too big you can always put the code into a named method and use a method reference to that new method instead. That’s what **I** call cleaner… – Holger Aug 26 '16 at 08:53
2

It would seem that Streams can't handle this in one pipeline. The 'best practice' would be to create multiple streams:

List<User> users = IntStream.range(0, 7)
    .mapToObj(User::next)
    .collect(Collectors.toList());

users.stream()
    .filter(it -> it.name == null)
    .forEach(it -> it.name = "defaultValue");

users.stream()
    //other non-terminal operations
    //before terminal operation
    .forEach(it -> System.out.println(it.name));
Adam Bickford
  • 1,216
  • 12
  • 14
  • I wouldn't have thought of this on my own, but it really does seem like the best solution. – Tim M. Jul 21 '17 at 11:05
  • 4
    As others have already pointed out, this is a clear misuse of Streams. And hopefully people we would stop voting on this answer up as it encourages readers to think this is OK. Do not mutate on streams. As other have pointed out, if only the name is needed, use Stream.map. If other transformations are needed before getting the name, use an immutable User object, perform transformations, then change its name via the Stream.map method – ailveen Dec 18 '17 at 07:48
  • This only works for a subset of streams (ones that restart). As a general solution, streams shouldn't be considered restartable. – Thor Jan 13 '22 at 00:34
  • @Thor The stream is only read once per pipeline, users is a List. When I wrote this question I was new to FP and wanted to use them more like Groovy closures which isn't appropriate. Java just makes using them correctly very painful (lombok helps tremendously). – Adam Bickford Jan 26 '22 at 16:36
  • The advantage of a stream can be that don't have to load everything into memory. This solution kindof blows that out of the water as you're putting it all into a list again (all in memory), then streaming it again. – TheJeff Aug 11 '22 at 13:26