I have a function that creates an `IntStream` from two concatenated strings, keeps the distinct characters, sorts them, collects them into a `StringBuilder`, and then converts that into a string. Below is the function.

public static String longest(String s1, String s2) {
    String s = s1 + s2;
    return s.chars()
            .distinct()
            .sorted()
            .collect(StringBuilder::new, StringBuilder::appendCodePoint, StringBuilder::append)
            .toString();
}

I am really struggling to work out how the `collect` call with the StringBuilder works. I have searched online and read the Java docs, but I can't make sense of it. From what I can make out, it creates a new instance of StringBuilder and appends each character in the stream to it. Can anyone give a better explanation? Thank you.

pocockn
  • Read the tutorial: https://docs.oracle.com/javase/tutorial/collections/streams/reduction.html#collect – JB Nizet Dec 04 '16 at 19:54
  • It is essential to know what exactly is troubling you. Is it the use of `collect` with 3 parameters? Is it the method-references `StringBuilder::new` and others? – Tunaki Dec 04 '16 at 19:54
  • Yeah it's the 3 collect parameters that are confusing me – pocockn Dec 04 '16 at 19:57
  • And did you see the [javadoc of the method](https://docs.oracle.com/javase/8/docs/api/java/util/stream/IntStream.html#collect-java.util.function.Supplier-java.util.function.ObjIntConsumer-java.util.function.BiConsumer-)? which also points to [this paragraph about mutable reduction](https://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html#MutableReduction)? – Tunaki Dec 04 '16 at 19:59
  • By the way, your code fails with most characters. The `char` type has been essentially broken since Java 2, and legacy since Java 5. As a 16-bit value, a `char` is physically incapable of representing most of the over 144,000 characters defined in Unicode. Replace your `.chars` call with [`.codePoints`](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html#codePoints()). – Basil Bourque Mar 21 '23 at 14:48

2 Answers

To understand the three arguments, you need to understand what the stream needs to do: it loops through characters, and must append them to a StringBuilder.

So the first thing it needs to know is how to create an empty StringBuilder. That's what the first argument is for: it provides a function which, when called by the stream, creates an empty StringBuilder.

The second thing it needs to know is what to do with each character in the stream: it must append each one to the StringBuilder. That's what the second argument is for: it's a function which, when called by the stream, appends one character to the StringBuilder.

That's all you need if the stream is sequential. But if the stream is parallel, the stream splits the elements into several parts, and processes each part in parallel. Let's say it just uses two parts. It calls the first function twice to create two empty StringBuilders, and it processes the two parts in parallel, using the second function to append each part's characters to its own StringBuilder.

In the end, each part is transformed into a StringBuilder containing half of the characters. So the stream needs to know how to combine those two StringBuilders. That's what the third argument is for: it's a function which, when called by the stream, combines the two StringBuilders by appending all the characters from the second one to the first one.
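
To make the three roles concrete, here is a minimal sketch that replaces the method references from the question with explicitly typed lambdas. The class name `CollectExplained` and the variable names `supplier`, `accumulator` and `combiner` are my own choices for illustration; the pipeline itself is the one from the question.

```java
import java.util.function.BiConsumer;
import java.util.function.ObjIntConsumer;
import java.util.function.Supplier;

class CollectExplained {
    static String longest(String s1, String s2) {
        // Argument 1 (StringBuilder::new): creates an empty result container.
        Supplier<StringBuilder> supplier = () -> new StringBuilder();

        // Argument 2 (StringBuilder::appendCodePoint): adds one int element
        // of the stream to a container.
        ObjIntConsumer<StringBuilder> accumulator =
                (sb, value) -> sb.appendCodePoint(value);

        // Argument 3 (StringBuilder::append): merges the second container
        // into the first; needed when the stream runs in parallel.
        BiConsumer<StringBuilder, StringBuilder> combiner =
                (left, right) -> left.append(right);

        return (s1 + s2).chars()
                .distinct()
                .sorted()
                .collect(supplier, accumulator, combiner)
                .toString();
    }

    public static void main(String[] args) {
        // prints "abcdefklmopqwxy"
        System.out.println(longest("xyaabbbccccdefww", "xxxxyyyyabklmopq"));
    }
}
```

The method references in the question are just a shorter way of writing these three lambdas, so both versions should behave the same.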

JB Nizet

Argument 1: Creates your starting result (in this case, your new StringBuilder).

Argument 2: Adds an element (here an `int` holding a character value, not a String) to your result (StringBuilder).

Argument 3: If you run the stream in parallel, multiple StringBuilders will be created. This argument combines them into a single result.
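
If you want to watch argument 3 do its job, something like the rough sketch below might help. It runs the same pipeline sequentially and in parallel and counts how often arguments 1 and 3 are called. The class `ParallelCollectDemo`, the method `distinctSorted` and the two counters are invented for this illustration. How many StringBuilders a parallel run creates depends on how the runtime splits the work, so I have not shown any counts.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;
import java.util.function.Supplier;
import java.util.stream.IntStream;

class ParallelCollectDemo {
    // Hypothetical counters, only here to observe what the stream does.
    static final AtomicInteger created = new AtomicInteger();
    static final AtomicInteger merged = new AtomicInteger();

    static String distinctSorted(String s, boolean parallel) {
        IntStream stream = s.chars().distinct().sorted();
        if (parallel) {
            stream = stream.parallel();
        }

        // Argument 1, with a counter added.
        Supplier<StringBuilder> supplier = () -> {
            created.incrementAndGet();
            return new StringBuilder();
        };
        // Argument 3, with a counter added.
        BiConsumer<StringBuilder, StringBuilder> combiner = (left, right) -> {
            merged.incrementAndGet();
            left.append(right);
        };

        return stream.collect(supplier, StringBuilder::appendCodePoint, combiner)
                .toString();
    }

    public static void main(String[] args) {
        String input = "xyaabbbccccdefww" + "xxxxyyyyabklmopq";

        String sequential = distinctSorted(input, false);
        System.out.println("sequential: " + sequential
                + " created=" + created.get() + " merged=" + merged.get());

        created.set(0);
        merged.set(0);

        String inParallel = distinctSorted(input, true);
        System.out.println("parallel:   " + inParallel
                + " created=" + created.get() + " merged=" + merged.get());

        // The stream is ordered after sorted(), so both runs give the same text.
        System.out.println(sequential.equals(inParallel)); // prints "true"
    }
}
```

The result is the same either way because the combiner is applied in encounter order. Only the number of intermediate StringBuilders differs.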

Joe C