10

I would like to get an answer pointing out the reasons why the following idea described below on a very simple example is commonly considered bad and know its weaknesses.

I have a sentence of words and my goal is to make every second one to uppercase. My starting point for both of the cases is exactly the same:

String sentence = "Hi, this is just a simple short sentence";
String[] split = sentence.split(" ");

The traditional and procedural approach is:

StringBuilder stringBuilder = new StringBuilder();
for (int i=0; i<split.length; i++) {
    if (i%2==0) {
        stringBuilder.append(split[i]);
    } else {
        stringBuilder.append(split[i].toUpperCase());
    }
    if (i<split.length-1) { stringBuilder.append(" "); }
}

When want to use the use is limited due the effectively-final or final variable constraint used in the lambda expression. I have to use the workaround using the array and its first and only index, which was suggested in the first comment of my question How to increment a value in Java Stream. Here is the example:

int index[] = {0};
String result = Arrays.stream(split)
    .map(i -> index[0]++%2==0 ? i : i.toUpperCase())
    .collect(Collectors.joining(" "));

Yeah, it's a bad solution and I have heard few good reasons somewhere hidden in comments of a question I am unable to find (if you remind me some of them, I'd upvote twice if possible). But what if I use AtomicInteger - does it make any difference and is it a good and safe way with no side effects compared to the previous one?

AtomicInteger atom = new AtomicInteger(0);
String result = Arrays.stream(split)
    .map(i -> atom.getAndIncrement()%2==0 ? i : i.toUpperCase())
    .collect(Collectors.joining(" "));

Regardless of how ugly it might look for anyone, I ask for the description of possible weaknesses and their reasons. I don't care the performance but the design and possible weaknesses of the 2nd solution.

Please, don't match AtomicInteger with multi-threading issue. I used this class since it receives, increments and stores the value in the way I need for this example.

As I often say in my answers that "Java Stream-API" is not the bullet for everything. My goal is to explore and find the edge where is this sentence applicable since I find the last snippet quite clear, readable and brief compared to StringBuilder's snippet.

Edit: Does exist any alternative way applicable for the snippets above and all the issues when it’s needed to work with both item and index while iteration using Stream-API?

Nikolas Charalambidis
  • 40,893
  • 16
  • 117
  • 183
  • "*I would like to get an objective technical review*" - [Code Review](https://codereview.stackexchange.com/). – Turing85 Jun 21 '18 at 21:42
  • 2
    [Stream processing should be free of side-effects](https://docs.oracle.com/javase/10/docs/api/java/util/stream/package-summary.html). Your solution, however, relies on a side-effect. What exactly is wrong with the "traditional" solution? – Turing85 Jun 21 '18 at 21:49
  • 2
    It's "wrong" in the same way as the previous snippet: good stream processing should give the same result whether the stream is sequantial or parallel. Both techniques will lead to random, undeterministic results if the stream is parallel. The proper way to do that with streams would be to map the elements of the array to a stream of objects containing an index and a string, and then to map each object to uppercase or lowercase depending on its index, and then to join. But a good old for loop is both simpler and faster. – JB Nizet Jun 21 '18 at 21:53
  • 4
    Or to use a stream of indices, and to map each index to the uppercase or lowercase corresponding string, and then join. – JB Nizet Jun 21 '18 at 21:57
  • That's how I would do that as well – `IntStream.range(0, split.length).mapToObj(i -> i % 2 == 0 ? split[i] : split[i].toUpperCase())...`. – MC Emperor Dec 23 '21 at 21:36

3 Answers3

15

The documentation of the java.util.stream package states that:

Side-effects in behavioral parameters to stream operations are, in general, discouraged, as they can often lead to unwitting violations of the statelessness requirement, as well as other thread-safety hazards.

[...]

The ordering of side-effects may be surprising. Even when a pipeline is constrained to produce a result that is consistent with the encounter order of the stream source (for example, IntStream.range(0,5).parallel().map(x -> x*2).toArray() must produce [0, 2, 4, 6, 8]), no guarantees are made as to the order in which the mapper function is applied to individual elements, or in what thread any behavioral parameter is executed for a given element.

This means that the elements may be processed out of order, and thus the Stream-solutions may produce wrong results.

This is (at least for me) a killer argument against the two Stream-solutions.

By the process of elimination, we only have the "traditional solution" left. And honestly, I do not see anything wrong with this solution. If we wanted to get rid of the for-loop, we could re-write this code using a foreach-loop:

boolean toUpper = false; // 1st String is not capitalized
for (String word : splits) {
    stringBuilder.append(toUpper ? word.toUpperCase() : word);
    toUpper = !toUpper;
}

For a streamified and (as far as I know) correct solution, take a look at Octavian R.'s answer.


Your question wrt. the "limits of streams" is opinion-based.

The answer to the question (s) ends here. The rest is my opinion and should be regarded as such.


In Octavian R.'s solution, an artificial index-set is created through a IntStream, which is then used to access the String[]. For me, this has a higher cognitive complexity than a simple for- or foreach-loop and I do not see any benefit in using streams instead of loops in this situation.

Turing85
  • 18,217
  • 7
  • 33
  • 58
6

In Java, comparing with Scala, you must be inventive. One solution without mutation is this one:

String sentence = "Hi, this is just a simple short sentence";
String[] split = sentence.split(" ");
String result = IntStream.range(0, split.length)
                         .mapToObj(i -> i%2==0 ? split[i].toUpperCase():split[i])
                         .collect(Collectors.joining(" "));
System.out.println(result);

In Java streams you should avoid the mutation. Your solution with AtomicInteger it's ugly and it's a bad practice.

Kind regards!

Octavian R.
  • 1,231
  • 8
  • 12
  • This does not answer OPs question. – Turing85 Jun 21 '18 at 22:03
  • 1
    @Turing85: It actually doesn’t but gives at least an idea of alternative solution based on applying Stream to the indicies instead. He practically answered another question before I edited this one to ask it. – Nikolas Charalambidis Jun 21 '18 at 22:10
  • @Nikolas You should limit yourself to one question per post. Editing posts and "adding" new question can make perfeclty valid answers invalid. – Turing85 Jun 21 '18 at 22:11
  • 1
    @Turing85: I note that. However, since it’s here and is tightly coupled with my question, I used it. I don’t want to repeat the core of my question again and spam SO. – Nikolas Charalambidis Jun 21 '18 at 22:14
  • @Nikolas If you had asked for a "nice, streamified solution" to your problem from the get-go, I would not have answered the question. Now I am glad I can reference Octavian R.'s answer, otherwise I might have deleted my answer due to its incompleteness. – Turing85 Jun 21 '18 at 22:26
4

As explained in Turing85’s answer, your stream solutions are not correct, as they rely on the processing order, which is not guaranteed. This can lead to incorrect results with parallel execution today, but even if it happens to produce the desired result with a sequential stream, that’s only an implementation detail. It’s not guaranteed to work.

Besides that, there is no advantage in rewriting code to use the Stream API with a logic that basically still is a loop, but obfuscated with a different API. The best way to describe the idea of the new APIs, is to say that you should express what to do but not how.

Starting with Java 9, you could implement the same thing as

String result = Pattern.compile("( ?+[^ ]* )([^ ]*)").matcher(sentence)
    .replaceAll(m -> m.group(1)+m.group(2).toUpperCase());

which expresses the wish to replace every second word with its upper case form, but doesn’t express how to do it. That’s up to the library, which likely uses a single StringBuilder instead of splitting into an array of strings, but that’s irrelevant to the application logic.

As long as you’re using Java 8, I’d stay with the loop and even when switching to a newer Java version, I would consider replacing the loop as not being an urgent change.

The pattern in the above example has been written in a way to do exactly the same as your original code splitting at single space characters. Usually, I’d encode “replace every second word” more like

String result = Pattern.compile("(\\w+\\W+)(\\w+)").matcher(sentence)
    .replaceAll(m -> m.group(1)+m.group(2).toUpperCase());

which would behave differently when encountering multiple spaces or other separators, but usually is closer to the actual intention.

Holger
  • 285,553
  • 42
  • 434
  • 765