
I am trying to digest the Stream package, and it seems very difficult for me to understand.

I was reading the Stream package documentation, and at one point I tried to implement it to learn by doing. This is the text I read:

Intermediate operations return a new stream. They are always lazy; executing an intermediate operation such as filter() does not actually perform any filtering, but instead creates a new stream that, when traversed, contains the elements of the initial stream that match the given predicate. Traversal of the pipeline source does not begin until the terminal operation of the pipeline is executed.
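To make the quoted laziness concrete, here is a small sketch of my own (the class name and data are made up; requires Java 9+ for `List.of`) that counts how many times the filter predicate actually runs:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class LazyDemo {
    // Returns {predicate calls after building the pipeline,
    //          predicate calls after the terminal op, number of matches}.
    static int[] filterCalls() {
        AtomicInteger calls = new AtomicInteger();
        List<String> data = List.of("a", "bb", "ccc");
        // Building the pipeline does NOT run the predicate yet.
        Stream<String> pipeline = data.stream()
                .filter(s -> { calls.incrementAndGet(); return s.length() > 1; });
        int before = calls.get();        // still 0: filter() is lazy
        long matches = pipeline.count(); // the terminal operation triggers traversal
        int after = calls.get();         // now 3: one predicate call per element
        return new int[] { before, after, (int) matches };
    }

    public static void main(String[] args) {
        int[] r = filterCalls();
        System.out.println("calls before terminal op: " + r[0]
                + ", after: " + r[1] + ", matches: " + r[2]);
    }
}
```

The predicate runs zero times until the terminal operation is invoked, which is exactly what "lazy" means here.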

I understand this much, that they provide a new Stream, so my first question is: is creating a stream without traversing it a heavy operation?

Now, intermediate operations are lazy and terminal operations are eager, and streams are supposed to be more efficient than the old if-else style of programming, as well as more readable.

Processing streams lazily allows for significant efficiencies; in a pipeline such as the filter-map-sum example above, filtering, mapping, and summing can be fused into a single pass on the data, with minimal intermediate state. Laziness also allows avoiding examining all the data when it is not necessary; for operations such as "find the first string longer than 1000 characters", it is only necessary to examine just enough strings to find one that has the desired characteristics without examining all of the strings available from the source. (This behavior becomes even more important when the input stream is infinite and not merely large.)
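The short-circuiting behavior described above can be demonstrated with an infinite stream. This sketch (my own illustration, not from the docs) counts how many elements are actually examined:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class ShortCircuitDemo {
    // Returns {first matching element, number of elements actually examined}.
    static int[] examine() {
        AtomicInteger seen = new AtomicInteger();
        int first = Stream.iterate(0, i -> i + 1)  // infinite source: 0, 1, 2, ...
                .peek(i -> seen.incrementAndGet()) // count every element pulled through
                .filter(i -> i > 5)
                .findFirst()                       // short-circuiting terminal operation
                .get();
        return new int[] { first, seen.get() };    // only 0..6 were pulled from the source
    }

    public static void main(String[] args) {
        int[] r = examine();
        System.out.println("first match: " + r[0] + ", elements examined: " + r[1]);
    }
}
```

Even though the source is infinite, `findFirst()` stops pulling elements as soon as one passes the filter.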

To demonstrate this, I implemented a small program to understand the concept. Here is the program:

        List<String> stringList = new ArrayList<>();
        for (int i = 0; i < 10000; i++) {
            stringList.add("String" + i);
        }
        long   start  = System.currentTimeMillis();
        Stream<String> stream = stringList.stream().filter(s -> s.contains("99"));
        long   midEnd = System.currentTimeMillis();
        System.out.println("Time in millis before applying terminal operation: " + (midEnd - start));
        System.out.println(stream.findFirst().get());
        long end = System.currentTimeMillis();
        System.out.println("Whole time in millis: " + (end - start));
        System.out.println("Time in millis for Terminal operation: " + (end - midEnd));

        start = System.currentTimeMillis();
        for (String ss1 : stringList) {
            if (ss1.contains("99")) {
                System.out.println(ss1);
                break;
            }
        }
        end = System.currentTimeMillis();
        System.out.println("Time in millis with old standard: " + (end - start));

I have executed this program many times, and each time it has shown me that creating a new stream from intermediate operations is the heavy task. Terminal operations take very little time compared to intermediate operations.

And overall, the old if-else pattern is far more efficient than streams. So, again, more questions here:

  1. Did I misunderstand something?
  2. If my understanding is correct, why and when should streams be used?
  3. If I am doing or understanding anything wrong, can you please help clarify my concepts of the java.util.stream package?

Actual Numbers:

Try 1:

Time in millis before applying terminal operation: 73
String99
Whole time in millis: 76
Time in millis for Terminal operation: 3
String99
Time in millis with old standard: 0

Try 2:

Time in millis before applying terminal operation: 56
String99
Whole time in millis: 59
Time in millis for Terminal operation: 3
String99
Time in millis with old standard: 0

Try 3:

Time in millis before applying terminal operation: 69
String99
Whole time in millis: 72
Time in millis for Terminal operation: 3
String99
Time in millis with old standard: 0

These are my machine details, in case they help:

Memory: 11.6 GiB
Processor: Intel® Core™ i7-3632QM CPU @ 2.20GHz × 8 
OS-Type: 64-bit
java8.being
  • Thank you @Pshemo for adding the right tags. – java8.being May 11 '16 at 12:28
  • 10
    `stringList.stream().filter(s -> s.contains("99"));` probably takes less than a microsecond - you can't mesaure it like you are doing. Essential read: http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java – assylias May 11 '16 at 12:30
  • 2
    imho : use it for code readability. Also, writing using `Stream` allows you to easily choose to activate/deactivate parallel computing but it will not magically optimize a code that is already optimal.. – Arnaud Denoyelle May 11 '16 at 12:32
  • For one thing, you use a `break`; the stream doesn't know that it can break there, so it will filter every element. Never mind, I see you are only finding the first in the stream. – matt May 11 '16 at 12:32
  • @matt intermediate operations are lazy and terminal operations are eager. So, how is it going to traverse the whole stream, whereas the package description states it will not? – java8.being May 11 '16 at 12:34
  • Maybe you missed my edit, I didn't realize you were using findFirst(), which is a [short-circuit terminal operation](http://docs.oracle.com/javase/8/docs/api/java/util/stream/package-summary.html#StreamOps) so yes, it wouldn't need to filter the whole thing. – matt May 11 '16 at 12:37
  • 1
    I still don't think that the benchmark is meaningful. I think you can make a rule of thumb here though. `If your task takes less than 1ms to complete, using a stream will not improve the performance.` – matt May 11 '16 at 13:10
  • But it should not degrade it either, correct? @matt :D – java8.being May 11 '16 at 13:12
  • @assylias I understand your point here, but the benchmark should produce wrong results in the case of `old-style` programming too. :( – java8.being May 11 '16 at 13:14
  • 1
    You are just adding some overhead by creating an object. You have a task that is not measurable, and you are adding overhead to it. You need to go back to the drawing board for your benchmark. – matt May 11 '16 at 13:18
  • 1
    One way to think about the sequence of non-terminal operations is as being analogous to "compiling" a regular expression before applying it repeatedly to a lot of Strings. Yes, it takes a few cycles to compile the regular expression, but there is no point benchmarking the compilation--the performance of the repeated operation is what matters. – Hank D May 11 '16 at 17:20

4 Answers


One of the rationales for the Stream API is that it eliminates the inherent assumption of the for loop: that all iteration happens in the same way. When you use an iterator-based for loop, you are hard-coding the iteration logic to always iterate sequentially. Consider the question, "what if I wanted to replace the implementation of the 'for' loop with something more efficient?"

The Stream API addresses that: it abstracts the notion of iteration and allows other ways of processing multiple data points to be considered -- iterating serially vs. in parallel, adding optimizations if it is known that the data is unordered, etc.

Consider your example: although you can't change the implementation of the for loop, you can change the implementation of the Stream to suit different situations. For example, if you have more CPU-intensive operations to perform on each element, you might choose a parallel Stream. Here's an example where 10 ms delays simulate more complex processing, done in parallel, with very different results:

    List<String> stringList = new ArrayList<>();
    for (int i = 0; i < 10000; i++) {
        stringList.add("String" + i);
    }
    long   start  = System.currentTimeMillis();
    Stream<String> stream = stringList.parallelStream().filter(s -> {
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return s.contains("99");
    });
    long   midEnd = System.currentTimeMillis();
    System.out.println("Time in millis before applying terminal operation: " + (midEnd - start));
    System.out.println(stream.findAny().get());
    long end = System.currentTimeMillis();
    System.out.println("Whole time in millis: " + (end - start));
    System.out.println("Time in millis for Terminal operation: " + (end - midEnd));

    start = System.currentTimeMillis();
    for (String ss1 : stringList) {
        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        if (ss1.contains("99")) {
            System.out.println(ss1);
            break;
        }
    }
    end = System.currentTimeMillis();
    System.out.println("Time in millis with old standard: " + (end - start));

I kept the same benchmark logic that everyone is complaining about, to make it easier for you to compare.

As you can see, there are situations where for loops will always be more efficient than using a Stream, but Streams offer significant advantages in certain situations as well. It would be unwise to extrapolate from one isolated test that one approach is always better than the other--that is an axiom for life as well.

Hank D
  • Alright, so what I understood so far, from your response as well as others' and my own experience, is that streams cannot and should not be confused with efficiency with respect to iteration? – java8.being May 11 '16 at 16:58
  • 2
    Correct. Nor should it be confused with inefficiency. – Hank D May 11 '16 at 17:03

Unless your tests involve JMH, your code is pretty much proof of nothing and, even worse, it will give an ALTERED impression of reality.

assylias made the comment that should make clear what goes wrong.

Also, your measurements of the "intermediate operation" and then the "short circuit" are wrong. The intermediate operation, because it is lazy, does nothing really; it will only take place when a terminal one kicks in.

If you have ever worked with Guava, this is how transform/filter is done in their code too, at least logically.

Eugene
  • `The intermediate operation, because it is lazy, does nothing really, it will only take place when a terminal one will kick in.` I agree. And no I have never worked with guava. – java8.being May 11 '16 at 13:16
  • 1
    While it is correct to emphasize the incorrect benchmark of the question, stating that using JMH was mandatory for a correct benchmark, isn’t better. While JMH can help avoiding certain benchmark mistakes, it’s still just a tool. There might be developers capable of avoiding these mistakes themself (after all, the JMH authors are developers too) and you can still make errors while using JMH (especially when you overestimate what the tool can do for you). – Holger May 11 '16 at 15:20
  • 1
    @Holger you're probably right, from what I did with the tests via Caliper, JMH and long looping, JMH does things best IMO. So every time I hear, these are my micro benchmark results, I immediately jump saying USE JMH. It feels so natural and probably a bit wrong. Good point, thx – Eugene May 11 '16 at 15:27
  • 2
    There is nothing wrong in recommending JMH, it’s just not correct to say something along “*unless your tests involve JMH*, …[it can’t be correct]”. That’s too strong. And I would always recommend to “use JMH” *and* “to carefully read its documentation”… – Holger May 11 '16 at 15:31

I really don't trust your "benchmark", because too many things can go wrong; you had better use a framework. But anyway, when people or the docs say streams are more efficient, they don't mean the example you provided.

Streams, as lifted collections (they don't hold data), are more efficient than eager ones, like Scala Lists for instance, where a filter allocates a new List and a map transforms the results into a new List.

When we compare with that implementation, Streams win. But yes, streams allocate objects, which is very cheap on modern JVMs and well handled by modern GCs.
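To illustrate the point, here is a sketch of my own (written in Java rather than Scala; class and method names are made up): the eager style materializes a full intermediate list per step, while the stream pipeline fuses filter and map into a single pass and allocates only the result:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class EagerVsLazy {
    // Eager style: every step materializes a full intermediate collection.
    static List<Integer> eager(List<String> src) {
        List<String> filtered = new ArrayList<>();   // first intermediate allocation
        for (String s : src) {
            if (s.contains("9")) filtered.add(s);
        }
        List<Integer> lengths = new ArrayList<>();   // second allocation
        for (String s : filtered) {
            lengths.add(s.length());
        }
        return lengths;
    }

    // Stream style: filter and map are fused into a single pass over src;
    // only the final result list is allocated.
    static List<Integer> fused(List<String> src) {
        return src.stream()
                .filter(s -> s.contains("9"))
                .map(String::length)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> src = List.of("String9", "String10", "String99");
        System.out.println(eager(src));
        System.out.println(fused(src));
    }
}
```

Both produce the same result; the difference is how many intermediate collections get built along the way.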

Sleiman Jneidi

As others have already noted, your benchmark is flawed. The main problem is that the results are skewed by ignoring compilation time. Try the following:

    Stream<String> stream = stringList.stream().filter(s -> s.contains("99"));
    long   start  = System.currentTimeMillis();
    stream = stringList.stream().filter(s -> s.contains("99"));
    long   midEnd = System.currentTimeMillis();

Now the code that backs filter is already compiled, and the second call is fast. Even this would work:

    Stream<String> stream = stringList.stream().map(s -> s);
    long   start  = System.currentTimeMillis();
    stream = stringList.stream().filter(s -> s.contains("99"));
    long   midEnd = System.currentTimeMillis();

map shares most of its code with filter, so calling filter is fast here too, because the code is already compiled. And in case you ask: calling filter or map on a different stream would also work, of course.

Your "old style" code doesn't require additional compilation.

a better oliver
  • Aha, now I see. Everyone told me my benchmark was flawed (and I accept that), but you are the only one who really helped me understand the compilation time it takes for filter. – java8.being May 11 '16 at 15:04
  • 4
    You are still measuring the initialization time for the second lambda expression. Executing the code a second time will be even faster. Alternatively, you can store `s -> s.contains("99")` in a `Predicate` variable and re-use it. Besides that, always use `System.nanoTime()` for measuring *elapsed time*. – Holger May 11 '16 at 15:24
  • Thank you @Holger for your valuable feedback. – java8.being May 11 '16 at 15:54