Why my string manipulation is slow using lambda expression?

Question

A method takes comma separated words as a String and returns a String of comma separated words with the words in natural sort order, not containing any 4 letter words, contain all words in UPPER case and no duplicates. The 1st approach is quite slow in comparison to the 2nd approach. Can you help me understand why and how can I improve my approach?

Approach 1:

public String stringProcessing(String s){
      Stream<String> tokens = Arrays.stream(s.split(","));
      return tokens.filter(t -> t.length() != 4) .distinct()
                   .sorted() 
                   .collect(Collectors.joining(",")).toUpperCase();
}

Approach 2:

public String processing(String s) {
    String[] tokens = s.split(",");
    Set<String> resultSet = new TreeSet<>();
    for(String t:tokens){
        if(t.length() !=  4)
            resultSet.add(t.toUpperCase());
    }        
    StringBuilder result = new StringBuilder();
    resultSet.forEach(key -> {
        result.append(key).append(","); 
    });
    result.deleteCharAt(result.length()-1);
    return result.toString();
}

*The approach 1 is quite slow in comparison approach* how did u measure? — Eugene, Jan 30 '18 at 10:38
Instant start = Instant.now(); String s = stringProcessing(buffer.toString()); Instant end = Instant.now(); Duration elapsed = Duration.between(start, end); System.out.println("Total time for java8: " + elapsed.getNano()); — Kumar Manish, Jan 30 '18 at 10:40
you are measuring cold start like that... look at `JMH` for instance on how to correctly measure — Eugene, Jan 30 '18 at 10:40
[How do I write a correct micro-benchmark in Java?](https://stackoverflow.com/q/504103) I wouldn't be surprised if the second one is faster for few elements and the first one is faster for many elements (that's not uncommon when you have two algorithms that work very differently under the hood, which is the case here). — Bernhard Barker, Jan 30 '18 at 11:20
you can use `java.util.regex.Pattern#splitAsStream` instead of `stream(s.split(","))`, maybe it will result your stream style code faster if your input data is large. — holi-java, Jan 30 '18 at 11:25
also a side note: sorting using natural String order, will be messed up, if you then do `toUpperCase()` on the result, as natural order is: `A` < ... < `Z` < `a` < ... < `z` — diginoise, Jan 30 '18 at 11:26
For better efficiency you should call `.distinct()` *after* `.sorted()`, not before it. — DodgyCodeException, Jan 30 '18 at 13:56

Holger · Answer 1 · 2018-01-30T14:24:33.237

A performance comparison without documenting the used JRE version, input data sets nor benchmark methodology is not suitable to draw any conclusions.

Further, there are fundamental differences between your variants. You first variant processes the original strings when using distinct(), potentially keeping much more elements than the second variant, joins all of them to a string, before transforming the complete result string to upper case. In contrast, your second variant transforms individual elements before adding to the set, so only strings with a distinct upper case representation are processed further. So the second variant may need significantly less memory and process less elements when joining.

So when doing entirely different things, there is not much sense in comparing the performance of these operations. A better comparison would be between these two variants:

public String variant1(String s){
    Stream<String> tokens = Arrays.stream(s.split(","));
    return tokens.filter(t -> t.length() != 4)
                 .map(String::toUpperCase)
                 .sorted().distinct()
                 .collect(Collectors.joining(","));
}

public String variant2(String s) {
    String[] tokens = s.split(",");
    Set<String> resultSet = new TreeSet<>();
    for(String t:tokens){
        if(t.length() !=  4)
            resultSet.add(t.toUpperCase());
    }
    return String.join(",", resultSet);
}

Note that I changed the order of sorted() and distinct(); as discussed in this answer, applying distinct() directly after sorted() allows to exploit the sorted nature of the stream within the distinct operation.

You may also consider not creating a temporary array holding all substrings before streaming over them:

public String variant1(String s){
    return Pattern.compile(",").splitAsStream(s)
            .filter(t -> t.length() != 4)
            .map(String::toUpperCase)
            .sorted().distinct()
            .collect(Collectors.joining(","));
}

You may also add a third variant,

public String variant3(String s) {
    Set<String> resultSet = new TreeSet<>();
    int o = 0, p;
    for(p = s.indexOf(','); p>=0; p = s.indexOf(',', o=p+1)) {
        if(p-o == 4) continue;
        resultSet.add(s.substring(o, p).toUpperCase());
    }
    if(s.length()-o != 4) resultSet.add(s.substring(o).toUpperCase());
    return String.join(",", resultSet);
}

which doesn’t create an array of substrings and even skips the substring creation for those not matching the filter criteria. This isn’t meant to suggest to go such low level in production code, but that there always might be a faster variant, so it doesn’t matter whether the variant you’re using is the fastest, but rather whether it runs at reasonable speed while being maintainable.

Eugene · Answer 2 · 2018-01-30T14:31:19.527

I guess it was only about time before some actually posted some JMH tests. I took Holger's methods and tested them:

@BenchmarkMode(value = { Mode.AverageTime })
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
@State(Scope.Benchmark)
public class StreamVsLoop {

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder().include(StreamVsLoop.class.getSimpleName())
                .build();
        new Runner(opt).run();
    }

    @Param(value = {
            "a, b, c",
            "a, bb, ccc, dddd, eeeee, ffffff, ggggggg, hhhhhhhh",
            "a, bb, ccc, dddd, eeeee, ffffff, ggggggg, hhhhhhhh, ooooooooo, tttttttttttttt, mmmmmmmmmmmmmmmmmm" })
    String s;

    @Benchmark
    @Fork(1)
    public String stream() {
        Stream<String> tokens = Arrays.stream(s.split(","));
        return tokens.filter(t -> t.length() != 4)
                .map(String::toUpperCase)
                .sorted().distinct()
                .collect(Collectors.joining(","));
    }

    @Benchmark
    @Fork(1)
    public String loop() {
        String[] tokens = s.split(",");
        Set<String> resultSet = new TreeSet<>();
        for (String t : tokens) {
            if (t.length() != 4) {
                resultSet.add(t.toUpperCase());
            }
        }
        return String.join(",", resultSet);
    }

    @Benchmark
    @Fork(1)
    public String sortedDistinct() {
        return Pattern.compile(",").splitAsStream(s)
                .filter(t -> t.length() != 4)
                .map(String::toUpperCase)
                .sorted()
                .distinct()
                .collect(Collectors.joining(","));
    }

    @Benchmark
    @Fork(1)
    public String distinctSorted() {
        return Pattern.compile(",").splitAsStream(s)
                .filter(t -> t.length() != 4)
                .map(String::toUpperCase)
                .distinct()
                .sorted()
                .collect(Collectors.joining(","));
    }
}

And here are the results:

 stream              3 args         574.042
 loop                3 args         393.364
 sortedDistinct      3 args         829.077
 distinctSorted      3 args         836.558

 stream              8 args         1144.488
 loop                8 args         1014.756
 sortedDistinct      8 args         1533.968
 distinctSorted      8 args         1745.055

 stream             11 args         1829.571
 loop               11 args         1514.138
 sortedDistinct     11 args         1940.256
 distinctSorted     11 args         2591.715

The results are sort of obvious, streams are slower, but not that much, probably the readability wins. Also, Holger is right (but he rarely, if ever, isn't)

It would be interesting to see how the "sensibly coded" streams version would compare (using `splitAsStream()`, calling `toUpperCase()` before `sorted()` and calling `sorted()` before `distinct()`). — DodgyCodeException, Jan 30 '18 at 13:59
It would be interesting to see the comparison with a parallelStream. — Chris311, Jan 30 '18 at 14:09
Still too artificial. Note that your input is pre-sorted. Further, there should be some more variance regarding the fraction that gets rejected by the filter. — Holger, Jan 30 '18 at 14:28
The names in the output don't seem to correspond with the method names. — DodgyCodeException, Jan 30 '18 at 14:29
@Holger I thought about that, but that would require a `@State` class, with random letters probably. I started this answer when I had time, and ended up doing 3 things at once, which Im not good at :( — Eugene, Jan 30 '18 at 14:31
With 11 elements, it shouldn't really matter one way or the other for most purposes. This benchmark would be more meaningful if you try thousands of elements, or more. — Bernhard Barker, Jan 30 '18 at 18:49
@Holger I agree on artificial, took me some time.. but I think these make better sense https://stackoverflow.com/a/48624167/1059372 — Eugene, Feb 05 '18 at 13:55

Eugene · Answer 3 · 2018-02-05T20:58:25.643

It took me a little bit to build a test that I would be fairly comfortable with; to actually judge the numbers I would get...

@BenchmarkMode(value = { Mode.AverageTime })
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Warmup(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 2, time = 2, timeUnit = TimeUnit.SECONDS)
@State(Scope.Benchmark)
public class StreamVsLoop {

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder().include(StreamVsLoop.class.getSimpleName())
                .jvmArgs("-ea")
                .shouldFailOnError(true)
                .build();
        new Runner(opt).run();
    }

    @State(Scope.Thread)
    public static class StringInput {

        private String[] letters = { "q", "a", "z", "w", "s", "x", "e", "d", "c", "r", "f", "v", "t", "g", "b",
                "y", "h", "n", "u", "j", "m", "i", "k", "o", "l", "p" };

        public String s = "";

        @Param(value = { "1000", "10000", "100000" })
        int next;

        @TearDown(Level.Iteration)
        public void tearDown() {
            if (next == 1000) {
                long count = Arrays.stream(s.split(",")).filter(x -> x.length() == 5).count();
                assert count == 99;
            }

            if (next == 10000) {
                long count = Arrays.stream(s.split(",")).filter(x -> x.length() == 5).count();
                assert count == 999;
            }

            if (next == 100000) {
                long count = Arrays.stream(s.split(",")).filter(x -> x.length() == 5).count();
                assert count == 9999;
            }
            s = null;
        }

        /**
         * a very brute-force tentative to have 1/2 elements to be filtered and 1/2 not
         * highly inneficiant, but this is not part of the measurment, so who cares?
         */
        @Setup(Level.Iteration)
        public void setUp() {

            for (int i = 0; i < next; i++) {
                int index = ThreadLocalRandom.current().nextInt(0, letters.length);
                String letter = letters[index];
                if (next == 1000) {
                    if (i < 500 && i % 4 == 0) {
                        s = s + "," + letter;
                    } else if (i > 500 && i % 5 == 0) {
                        s = s + "," + letter;
                    } else {
                        s = s + letter;
                    }

                } else if (next == 10000) {
                    if (i < 5000 && i % 4 == 0) {
                        s = s + "," + letter;
                    } else if (i > 5000 && i % 5 == 0) {
                        s = s + "," + letter;
                    } else {
                        s = s + letter;
                    }
                } else if (next == 100000) {
                    if (i < 50000 && i % 4 == 0) {
                        s = s + "," + letter;
                    } else if (i > 50000 && i % 5 == 0) {
                        s = s + "," + letter;
                    } else {
                        s = s + letter;
                    }
                }
            }
        }
    }

    @Benchmark
    @Fork
    public String stream(StringInput si) {
        Stream<String> tokens = Arrays.stream(si.s.split(","));
        return tokens.filter(t -> t.length() != 4)
                .map(String::toUpperCase)
                .sorted().distinct()
                .collect(Collectors.joining(","));
    }

    @Benchmark
    @Fork(1)
    public String loop(StringInput si) {
        String[] tokens = si.s.split(",");
        Set<String> resultSet = new TreeSet<>();
        for (String t : tokens) {
            if (t.length() != 4) {
                resultSet.add(t.toUpperCase());
            }
        }
        return String.join(",", resultSet);
    }

    @Benchmark
    @Fork(1)
    public String sortedDistinct(StringInput si) {
        return Pattern.compile(",").splitAsStream(si.s)
                .filter(t -> t.length() != 4)
                .map(String::toUpperCase)
                .sorted()
                .distinct()
                .collect(Collectors.joining(","));
    }

    @Benchmark
    @Fork(1)
    public String distinctSorted(StringInput si) {
        return Pattern.compile(",").splitAsStream(si.s)
                .filter(t -> t.length() != 4)
                .map(String::toUpperCase)
                .distinct()
                .sorted()
                .collect(Collectors.joining(","));
    }

    @Benchmark
    @Fork(1)
    public String variant3(StringInput si) {
        String s = si.s;
        Set<String> resultSet = new TreeSet<>();
        int o = 0, p;
        for (p = s.indexOf(','); p >= 0; p = s.indexOf(',', o = p + 1)) {
            if (p - o == 4) {
                continue;
            }
            resultSet.add(s.substring(o, p).toUpperCase());
        }
        if (s.length() - o != 4) {
            resultSet.add(s.substring(o).toUpperCase());
        }
        return String.join(",", resultSet);
    }
}

streamvsLoop.StreamVsLoop.distinctSorted    1000   0.028
streamvsLoop.StreamVsLoop.sortedDistinct    1000   0.024
streamvsLoop.StreamVsLoop.loop              1000   0.016
streamvsLoop.StreamVsLoop.stream            1000   0.020 
streamvsLoop.StreamVsLoop.variant3          1000   0.012


streamvsLoop.StreamVsLoop.distinctSorted   10000   0.394
streamvsLoop.StreamVsLoop.sortedDistinct   10000   0.359
streamvsLoop.StreamVsLoop.loop             10000   0.274
streamvsLoop.StreamVsLoop.stream           10000   0.304  ± 0.006
streamvsLoop.StreamVsLoop.variant3         10000   0.234


streamvsLoop.StreamVsLoop.distinctSorted  100000   4.950
streamvsLoop.StreamVsLoop.sortedDistinct  100000   4.432
streamvsLoop.StreamVsLoop.loop            100000   5.457
streamvsLoop.StreamVsLoop.stream          100000   3.927 ± 0.048
streamvsLoop.StreamVsLoop.variant3        100000   3.595

Holger's method wins, but boy is the difference small between the other solutions, once the code is hot enough.

Why my string manipulation is slow using lambda expression?

3 Answers3

Linked