
I am new to Java streams and functional programming, and I wonder whether my question has a valid use case or is simply not meaningful. I have the following code, which is meant to build a Set from an array:

    public static void main(String[] args) {
        String []arr = {"a","abc","b","cd"};
        Set<String> dict = new HashSet<>();
        Arrays.stream(arr).map(item->{
            System.out.println(item);
            dict.add(item);
            return dict;
        });

//        dict = Arrays.stream(arr).collect(Collectors.toSet());
        System.out.println(dict);
    }

This gives me an empty dict. If I do it the way shown in the commented-out second-to-last line, it works. Is there a way to populate, from inside the map function, a dict that was already instantiated outside the stream?

curiousengineer
  • Here is an Oracle article on _[Processing Data with Java SE 8 Streams](https://www.oracle.com/technical-resources/articles/java/ma14-java-se-8-streams.html)_. – Reilas May 25 '23 at 05:40
  • As a note, in Java, the term _dictionary_ typically refers to a list of key-value pairs, and not a set. Additionally, the method `map`, here, is not related to a `Map` collection; it is used to _"... project the elements of a stream into another form."_. – Reilas May 25 '23 at 05:45

2 Answers


Stream.map() is an intermediate operation, meaning it only stages a step of the pipeline, but it doesn't do anything when you call it. You need a terminal operation to put the stream to work. In your case, you can use forEachOrdered():

    Arrays.stream(arr)
            .peek(System.out::println)
            .forEachOrdered(dict::add);

Note: I'm assuming there's a reason you need to modify an existing set rather than produce a new one. Otherwise, just use collect() as suggested in the other answer.
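
To see the laziness described above in isolation, here is a minimal sketch. The class and method names (`LazyStreamDemo`, `withoutTerminalOp`, `withTerminalOp`) are hypothetical and exist only for illustration, and a `LinkedHashSet` is used instead of the question's `HashSet` purely so the printed order is predictable.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class LazyStreamDemo {

    // Builds a pipeline with map() but never attaches a terminal operation,
    // so the lambda is never invoked and the set stays empty.
    static Set<String> withoutTerminalOp(String[] arr) {
        Set<String> dict = new LinkedHashSet<>();
        Arrays.stream(arr).map(item -> {
            dict.add(item); // not reached: nothing pulls elements through
            return item;
        });
        return dict;
    }

    // Same source, but forEachOrdered() is a terminal operation,
    // so every element is actually processed and added.
    static Set<String> withTerminalOp(String[] arr) {
        Set<String> dict = new LinkedHashSet<>();
        Arrays.stream(arr).forEachOrdered(dict::add);
        return dict;
    }

    public static void main(String[] args) {
        String[] arr = {"a", "abc", "b", "cd"};
        System.out.println(withoutTerminalOp(arr)); // prints []
        System.out.println(withTerminalOp(arr));    // prints [a, abc, b, cd]
    }
}
```

The first method mirrors the code in the question; the only difference between the two is whether a terminal operation is present.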

shmosel
  • This code violates the "no side-effects rule". – Nikolas Charalambidis May 25 '23 at 04:54
  • @NikolasCharalambidis `forEachOrdered()` exists to support side-effects – shmosel May 25 '23 at 04:59
  • Interesting, I didn't know it serves exactly such a purpose. Do you have a reference to any documentation supporting this statement? – Nikolas Charalambidis May 25 '23 at 05:33
  • @NikolasCharalambidis, the violation is a minimum, _"... Streams are lazy; computation on the source data is only performed when the terminal operation is initiated, and source elements are consumed only as needed."_, from _[Stream (Java SE 20 & JDK 20)](https://docs.oracle.com/en/java/javase/20/docs/api/java.base/java/util/stream/Stream.html)._ – Reilas May 25 '23 at 06:03
  • @NikolasCharalambidis It's logical; if it doesn't return anything, it must be acting on something. But you can find mention of it [here](https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html) and [here](https://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html) with respect to `forEach()`. `forEachOrdered()` is only different in that it guarantees order and visibility, which solves half the issue of side-effects. – shmosel May 25 '23 at 06:04
  • The main place you want to avoid side-effects is in intermediate operations. Using `forEach()` and `forEachOrdered()` is more of a judgement call. Normally I'd prefer something like `toSet()` of course, but in a case where you already have an existing set that you want to update with a stream, terminal side-effects are perfectly appropriate. – shmosel May 25 '23 at 06:08
  • You can use `toSet()` to create a new set and *feed* it into the existing Set using `addAll` to be *more functional*, but that will create an additional instance of `Set` and would not increase readability (IMO) – user16320675 May 25 '23 at 06:20
  • @user16320675 Agreed. I would sooner replace the stream with a `for` loop. – shmosel May 25 '23 at 06:26
  1. Before the question got edited, the commented part didn't compile because dict must be either final or effectively final. See this answer for more information.
  2. One of the few rules of using the Stream API is to avoid side effects. The Javadoc for the java.util.stream package describes it well with examples:

    Side-effects in behavioral parameters to stream operations are, in general, discouraged, as they can often lead to unwitting violations of the statelessness requirement, as well as other thread-safety hazards.

  3. The gist of each Stream is to return a new collection/value through operations such as filtering, mapping, reducing, or advanced collecting. It was never meant to modify an existing one. The correct way to transform the array arr into a set is:
    Set<String> dict = Arrays.stream(arr)
          .peek(item -> System.out.println(item)) // optional for debugging
          .collect(Collectors.toSet());
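
If an existing set still has to be updated, one option raised in the comments on the other answer is to collect into a new set and merge it with `addAll`. The following is only a sketch of that idea; `MergeIntoExistingSet` and `addAllFrom` are hypothetical names, and the extra intermediate `Set` is the cost of keeping the stream itself free of side effects.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

public class MergeIntoExistingSet {

    // Hypothetical helper: the stream only produces a fresh set;
    // the mutation of the existing set happens outside the pipeline.
    static void addAllFrom(String[] arr, Set<String> existing) {
        Set<String> fresh = Arrays.stream(arr).collect(Collectors.toSet());
        existing.addAll(fresh);
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>();
        dict.add("z"); // pre-existing content that must be preserved
        addAllFrom(new String[] {"a", "abc", "b", "cd"}, dict);
        System.out.println(dict.size()); // prints 5
    }
}
```

Because `dict` is never reassigned, it stays effectively final and could also be captured by a lambda if needed.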
    
Nikolas Charalambidis
  • why use peek here? Because in my original code, I have a println? – curiousengineer May 25 '23 at 05:05
  • 1. does not compile? was the question edited (in the first 5 minutes)? Posted code is valid java code and should compile. || 2. "in general discouraged" is not exactly a "rule" || 3. "return **a new collection/value**" is just one use case – user16320675 May 25 '23 at 05:29
  • @user16320675 1) The question was edited. 2) It does not justify its usage either. 3) Name more? – Nikolas Charalambidis May 25 '23 at 05:29
  • @curiousengineer Yes, exactly. Feel free to move it away. – Nikolas Charalambidis May 25 '23 at 05:31
  • @user16320675 1) I have edited my answer. Remember I can't check each minute what the OP changed in the code to keep my wording 100% correct. The gist is clear: One cannot use non-(effectively) final variable in a lambda expression. It does not matter whether the `dict` assignment happens before or after the lambda expression definition as in both cases it strips the effectively-final characteristics from `dict`. 2) + 3) You still didn't provide a legit use-case when the procedural approach in Stream API shall be used instead of the declarative way. – Nikolas Charalambidis May 25 '23 at 05:40
  • 1) exactly the reason ("can't check each minute") why I added the comment to start with... 2+3) Why should I? It is not about justifying its usage, nor about being legit (very opinion-based: for me the posted question can be a legit use-case; it may not be in a pure functional language, but Java isn't such) – user16320675 May 25 '23 at 05:48
  • Furthermore, the term _closure_, and the concept of a functional-paradigm is to reduce mutating; or in this case re-assignment. – Reilas May 25 '23 at 05:53
  • Note that `peek()` also operates via side-effect by definition. But that's fine here; there's no hard rule against side-effects, as @user16320675 explained. – shmosel May 25 '23 at 06:16