Java 8 streams vs iterator performance

Question

I'm comparing 2 ways to filter lists, with and without using streams. It turns out that the method without using streams is faster for a list of 10,000 items. I'm interested in understanding why is it so. Can anyone explain the results please?

public static int countLongWordsWithoutUsingStreams(
        final List<String> words, final int longWordMinLength) {
    words.removeIf(word -> word.length() <= longWordMinLength);

    return words.size();
}

public static int countLongWordsUsingStreams(final List<String> words, final int longWordMinLength) {
    return (int) words.stream().filter(w -> w.length() > longWordMinLength).count();
}

Microbenchmark using JMH:

@Benchmark
@BenchmarkMode(Throughput)
@OutputTimeUnit(MILLISECONDS)
public void benchmarkCountLongWordsWithoutUsingStreams() {
    countLongWordsWithoutUsingStreams(nCopies(10000, "IAmALongWord"), 3);
}

@Benchmark
@BenchmarkMode(Throughput)
@OutputTimeUnit(MILLISECONDS)
public void benchmarkCountLongWordsUsingStreams() {
    countLongWordsUsingStreams(nCopies(10000, "IAmALongWord"), 3);
}

public static void main(String[] args) throws RunnerException {
    final Options opts = new OptionsBuilder()
        .include(PracticeQuestionsCh8Benchmark.class.getSimpleName())
        .warmupIterations(5).measurementIterations(5).forks(1).build();

    new Runner(opts).run();
}

java -jar target/benchmarks.jar -wi 5 -i 5 -f 1

Benchmark
Mode Cnt Score Error Units
PracticeQuestionsCh8Benchmark.benchmarkCountLongWordsUsingStreams thrpt 5 10.219 ± 0.408 ops/ms
PracticeQuestionsCh8Benchmark.benchmarkCountLongWordsWithoutUsingStreams thrpt 5 910.785 ± 21.215 ops/ms

Edit: (as someone deleted the update posted as an answer)

public class PracticeQuestionsCh8Benchmark {
    private static final int NUM_WORDS = 10000;
    private static final int LONG_WORD_MIN_LEN = 10;

    private final List<String> words = makeUpWords();

    public List<String> makeUpWords() {
        List<String> words = new ArrayList<>();
        final Random random = new Random();

        for (int i = 0; i < NUM_WORDS; i++) {
            if (random.nextBoolean()) {
                /*
                 * Do this to avoid string interning. c.f.
                 * http://en.wikipedia.org/wiki/String_interning
                 */
                words.add(String.format("%" + LONG_WORD_MIN_LEN + "s", i));
            } else {
                words.add(String.valueOf(i));
            }
        }

        return words;
    }

    @Benchmark
    @BenchmarkMode(AverageTime)
    @OutputTimeUnit(MILLISECONDS)
    public int benchmarkCountLongWordsWithoutUsingStreams() {
        return countLongWordsWithoutUsingStreams(words, LONG_WORD_MIN_LEN);
    }

    @Benchmark
    @BenchmarkMode(AverageTime)
    @OutputTimeUnit(MILLISECONDS)
    public int benchmarkCountLongWordsUsingStreams() {
        return countLongWordsUsingStreams(words, LONG_WORD_MIN_LEN);
    }
}
public static int countLongWordsWithoutUsingStreams(
    final List<String> words, final int longWordMinLength) {
    final Predicate<String> p = s -> s.length() >= longWordMinLength;

    int count = 0;

    for (String aWord : words) {
        if (p.test(aWord)) {
            ++count;
        }
    }

    return count;
}

public static int countLongWordsUsingStreams(final List<String> words,
    final int longWordMinLength) {
    return (int) words.stream()
    .filter(w -> w.length() >= longWordMinLength).count();
}

You need to return the value: `return countLongWordxxx();` from your benchmark methods. — assylias, Feb 06 '15 at 09:46
Guys, instead of playing a guessing game in this post, would anyone please *profile* both benchmarks and figure out *why* those are different? This is not House M.D., you don't need to conduct more experiments to produce the differential diagnosis, you can actually vivisect both tests and see how they are different. Performance profiling should be a basic skill, go! — Aleksey Shipilev, Feb 06 '15 at 22:03
When making comparisons, it's also helpful to compare things that are equivalent. The streams code actually has to process and count items, but asking the collection returned by `Collections.nCopies(n, obj)` for its size will simply return n! — Stuart Marks, Feb 07 '15 at 06:11
@AbhijitSarkar The edit improves things, but my point still stands. Others pointed out some problems with the original, including `removeIf` not working on an immutable collection, and not returning a value from the benchmark. Even if these were fixed, my point about `size()` doing different work from counting makes the comparison invalid. Now, the edit avoids `nCopies` and fixes these problems, but there are *still* differences between the two benchmarks. — Stuart Marks, Feb 07 '15 at 23:56
@AbhijitSarkar Specifically, one assigns a lambda to a local variable whereas the other uses it inline; and one counts using `int` values whereas the other counts using `long` values. It seems unlikely to me that these would make a significant difference, but the point is, we don't know, and they affect the validity of the comparison. Only when you have two benchmarks whose *only* difference is what you're measuring can you proceed with the analysis that Aleksey Shipilev suggested. — Stuart Marks, Feb 07 '15 at 23:59

Misha · Answer 1 · 2015-02-06T03:36:48.933

5

Whenever your benchmark says that some operation over 10000 elements takes 1ns (edit: 1µs), you probably found a case of clever JVM figuring out that your code doesn't actually do anything.

Collections.nCopies doesn't actually make a list of 10000 elements. It makes a sort of a fake list with 1 element and a count of how many times it's supposedly there. That list is also immutable, so your countLongWordsWithoutUsingStreams would throw an exception if there was something for removeIf to do.

edited Feb 06 '15 at 03:36

answered Feb 06 '15 at 01:54

Misha

27,433
6
62
78

Your comment about immutable list is right, I'll change the code so that there's actually something to remove and see what happens. As far as JVM optimizations go, it still has to iterate over the list to know the length of each element, right? BTW, where do you see 1ns, I see 10.219 ms and 910.785 ms? – Abhijit Sarkar Feb 06 '15 at 03:08
1

You are right. That's 1 microsecond, not nanosecond. Still way too fast to actually do 10000 of anything. – Misha Feb 06 '15 at 03:27
2

@Abhijit Sarkar: First of all, you are not using the result value, therefore the JVM might optimize away the entire computation. Second, it’s heavily implementation-dependent whether it actually need to iterate. Both methods may use the knowledge that they actually consist of a single element to perform only one predicate test. However, since the `nCopies` list doesn’t support mutation, it’s unlikely to have an optimized `removeIf`. But if you are going to use a real mutable list, it’s clear that mutating a `List` is more expensive than simply counting the matches. – Holger Feb 06 '15 at 11:20
@Holger Makes sense but oddly that's not what is happening, see my updated response below. – Abhijit Sarkar Feb 06 '15 at 16:44
2

@Abhijit Sarkar: You seemed to have missed the last sentence of my comment: “*But if you are going to use a real mutable list, it’s clear that mutating a List is more expensive than simply counting the matches.*” That’s especially true for `ArrayList` where removing isn’t cheap. So what’s surprising here? Using `removeIf(…)` to count occurrences is a bad idea. Why don’t you compare with an ordinary counting loop? – Holger Feb 06 '15 at 17:22
@Holger here's my implementation, streams still better (score 0.209 vs 0.355) public static int countLongWordsWithoutUsingStreams( final List words, final int longWordMinLength) { final LongAdder count = new LongAdder(); words.forEach(s -> { if (s.length() >= longWordMinLength) { count.increment(); } }); return count.intValue(); } – Abhijit Sarkar Feb 06 '15 at 17:48
@Holger I tested with your implementation too, same relative result. Have you tried running the tests yourself? – Abhijit Sarkar Feb 06 '15 at 18:31
@Abhijit Sarkar: I have no problems with streams being faster in some scenarios. You may want to read [this statement](http://stackoverflow.com/questions/22658322/java-8-performance-of-streams-vs-collections#comment36791431_22670380) – Holger Feb 06 '15 at 18:35
@Holger That thread doesn't offer much explanation of why's one faster than the other, which's what I'm looking for. The top answer essentially rewrites the original code and comes up with a conclusion exactly opposite to what I'm seeing - streams are **slower** for their code. – Abhijit Sarkar Feb 06 '15 at 19:43
@Abhijit Sarkar: I didn’t link to a thread nor answer but a particular comment from Brian Goetz. – Holger Feb 09 '15 at 08:46

score 2 · Answer 2 · edited May 23 '17 at 12:28

2

You do not return any values from your benchmark methods, thus, JMH has no chance to escape the computed values and your benchmark suffers dead code elimination. You compute how long it takes to do nothing. See the JMH page for further guidance.

Saying this, streams can be slower in some cases: Java 8: performance of Streams vs Collections

edited May 23 '17 at 12:28

Community

1
1

answered Feb 06 '15 at 06:42

Rafael Winterhalter

42,759
13
108
192

Probably due to my inexact wording on streams being always slower what is of course not true. Its corrected now. – Rafael Winterhalter Feb 06 '15 at 18:59

Java 8 streams vs iterator performance

2 Answers2