How to decide between lambda iteration and normal loop?

Question

Since the introduction of Java 8 I got really hooked on lambdas and started using them whenever possible, mostly to get accustomed to them. One of the most common uses is iterating over a collection of objects and acting upon each one, in which case I resort to either forEach or stream(). I rarely write the old for(T t : Ts) loop and I almost forgot about the for(int i = 0.....).

However, I was discussing this with my supervisor the other day and he told me that lambdas aren't always the best choice and can sometimes hinder performance. From a lecture I had seen on this new feature I got the feeling that lambda iterations are always fully optimized by the compiler and will (always?) be better than bare iterations, but he begs to differ. Is he right? If so, how do I pick the best solution in each scenario?

P.S: I'm not talking about cases where it is recommended to apply parallelStream. Obviously those will be faster.

Parallel streams can be slower than normal loops if the CPU reqs and/or stream size is not large. Threads are expensive to manage. — Bohemian, Jun 29 '16 at 06:47
The overhead of the stream framework may be surprising. Measure! — Thorbjørn Ravn Andersen, Jun 29 '16 at 06:51
Take a look at http://blog.takipi.com/benchmark-how-java-8-lambdas-and-streams-can-make-your-code-5-times-slower/ — Michael Markidis, Jun 29 '16 at 06:51
Use the one which you believe is simpler for you to write/understand. This will be different at different times and as you understand how to use lambdas better. — Peter Lawrey, Jun 29 '16 at 10:28
I'm comfortable with both methods that's why I'm now looking for efficiency. — PentaKon, Jun 29 '16 at 14:43

score 2 · Answer 1 · edited May 23 '17 at 12:08

Performance depends on so many factors that it’s hard to predict. Normally, if your supervisor claims that there is a performance problem, it is up to your supervisor to explain what the problem is.

One thing someone might be afraid of is that, behind the scenes, a class is generated for each lambda creation site (with the current implementation), so if the code in question is executed only once, this might be considered a waste of resources. This matches the fact that lambda expressions have a higher initialization overhead than ordinary imperative code (we are not comparing to inner classes here), so inside class initializers, which run only once, you might consider avoiding them. This is also in line with the fact that you should never use parallel streams in class initializers, so this potential advantage isn’t available there anyway.

For ordinary, frequently executed code that is likely to be optimized by the JVM, these problems do not arise. As you correctly supposed, classes generated for lambda expressions get the same treatment (optimizations) as other classes. In such places, calling forEach on a collection has the potential to be more efficient than a for loop.

The temporary object instances created for an Iterator or a lambda expression are negligible. However, it might be worth noting that a foreach loop will always create an Iterator instance, whereas lambda expressions do not always do so. While the default implementation of Iterable.forEach will create an Iterator as well, some of the most often used collections take the opportunity to provide a specialized implementation, most notably ArrayList.
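One way to observe this difference is to count how often iterator() gets called. The following is a minimal sketch, not a benchmark: CountingList and IteratorCountDemo are hypothetical names introduced only for illustration, and the result assumes an OpenJDK-style ArrayList whose forEach override walks the backing array directly.

```java
import java.util.ArrayList;
import java.util.Iterator;

class IteratorCountDemo {
    // Hypothetical helper: an ArrayList that counts calls to iterator().
    static class CountingList<E> extends ArrayList<E> {
        int iteratorCalls;

        @Override
        public Iterator<E> iterator() {
            iteratorCalls++;
            return super.iterator();
        }
    }

    // Returns {iterator() calls made by the foreach loop, calls made by forEach}.
    static int[] run() {
        CountingList<String> list = new CountingList<>();
        list.add("a");
        list.add("b");
        list.add("c");

        StringBuilder sink = new StringBuilder();
        for (String s : list) {          // compiled to iterator() + hasNext()/next()
            sink.append(s);
        }
        int callsByLoop = list.iteratorCalls;

        list.forEach(sink::append);      // uses ArrayList's own forEach override
        int callsByForEach = list.iteratorCalls - callsByLoop;

        return new int[] { callsByLoop, callsByForEach };
    }

    public static void main(String[] args) {
        int[] calls = run();
        System.out.println("foreach loop: " + calls[0] + ", forEach: " + calls[1]);
    }
}
```

On such a JDK the loop should account for one iterator() call and forEach for none. A collection that only inherits Iterable.forEach would create an Iterator in both cases.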

The ArrayList’s forEach is basically a for loop over an array, without any Iterator. It will then invoke the accept method of the Consumer, whose implementation is a generated class containing a trivial delegation to the synthetic method containing the code of your lambda expression. To optimize the entire loop, the horizon of the optimizer has to span the ArrayList’s loop over an array (a common idiom recognizable for an optimizer), the synthetic accept method containing a trivial delegation and the method containing your actual code.
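The shape of such a specialized forEach can be sketched as follows. This is a simplified model under stated assumptions, not the JDK source: TinyArrayList is a hypothetical class, and the real ArrayList additionally checks for concurrent modification.

```java
import java.util.Arrays;
import java.util.function.Consumer;

// Hypothetical, simplified stand-in for ArrayList (not the JDK implementation).
class TinyArrayList<E> {
    private Object[] elements = new Object[4];
    private int size;

    void add(E e) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2); // grow the backing array
        }
        elements[size++] = e;
    }

    // The whole iteration is one counted loop over the array; no Iterator is created.
    @SuppressWarnings("unchecked")
    void forEach(Consumer<? super E> action) {
        final Object[] es = elements;
        final int n = size;
        for (int i = 0; i < n; i++) {
            action.accept((E) es[i]);
        }
    }

    // Sums 1..5 through forEach; the lambda is passed as a Consumer.
    static int sumDemo() {
        TinyArrayList<Integer> list = new TinyArrayList<>();
        for (int i = 1; i <= 5; i++) {
            list.add(i);
        }
        int[] total = {0};
        list.forEach(x -> total[0] += x);
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println(sumDemo());
    }
}
```

The optimizer only has to see through this one loop and the accept call to reach the lambda body.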

In contrast, when iterating over the same list using a foreach loop, an Iterator implementation is created containing the ArrayList iteration logic, spread over two methods, hasNext() and next(), and the instance variables of the Iterator. The loop will repeatedly invoke the hasNext() method to check the end condition (index<size) and next(), which will recheck the condition before returning the element, as there is no guarantee that the caller properly invokes hasNext() before next(). Of course, an optimizer is capable of removing this duplication, but that requires more effort than not having it in the first place. So to get the same performance as the forEach method, the optimizer’s horizon has to span your loop code, the nontrivial hasNext() implementation and the nontrivial next() implementation.
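The duplicated end check can be made visible with a small model of an array-backed Iterator. This is a sketch under stated assumptions: ArrayIterable is a hypothetical class written for illustration, and it leaves out the concurrent-modification checks a real ArrayList iterator performs.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical, simplified model of an array-backed Iterable (not the JDK source).
class ArrayIterable<E> implements Iterable<E> {
    private final E[] elements;

    ArrayIterable(E[] elements) {
        this.elements = elements;
    }

    @Override
    public Iterator<E> iterator() {
        return new Iterator<E>() {
            private int cursor; // iteration state lives in an instance field

            @Override
            public boolean hasNext() {
                return cursor < elements.length; // end check requested by the loop
            }

            @Override
            public E next() {
                // Same check again: the caller may not have called hasNext().
                if (cursor >= elements.length) {
                    throw new NoSuchElementException();
                }
                return elements[cursor++];
            }
        };
    }

    // A foreach loop calls hasNext() and next() once per element.
    static int sum(Integer[] values) {
        Iterable<Integer> iterable = new ArrayIterable<Integer>(values);
        int total = 0;
        for (int v : iterable) {
            total += v;
        }
        return total;
    }

    // Shows why next() cannot trust the caller: it must fail on an exhausted source.
    static boolean nextWithoutHasNextThrows() {
        Iterator<Integer> it = new ArrayIterable<Integer>(new Integer[0]).iterator();
        try {
            it.next();
            return false;
        } catch (NoSuchElementException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sum(new Integer[] {1, 2, 3, 4}));
        System.out.println(nextWithoutHasNextThrows());
    }
}
```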

Similar things may apply to other collections having a specialized forEach implementation as well. This also applies to Stream operations, if the source provides a specialized Spliterator implementation, which does not spread the iteration logic over two methods like an Iterator.
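As a rough illustration of the Spliterator point, the sketch below traverses a list in three ways: element by element with tryAdvance (one method that both checks and fetches), in bulk with forEachRemaining, and through a stream. SpliteratorDemo and its method names are hypothetical, and the code says nothing about relative speed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

class SpliteratorDemo {
    // tryAdvance combines "is there another element?" and "process it" in one call.
    static int sumWithTryAdvance(List<Integer> list) {
        Spliterator<Integer> sp = list.spliterator();
        int[] sum = {0};
        while (sp.tryAdvance(x -> sum[0] += x)) {
            // nothing to do here; the action already ran
        }
        return sum[0];
    }

    // forEachRemaining hands the whole traversal to the source in a single call.
    static int sumWithForEachRemaining(List<Integer> list) {
        int[] sum = {0};
        list.spliterator().forEachRemaining(x -> sum[0] += x);
        return sum[0];
    }

    // A sequential stream is driven by the source's Spliterator.
    static int sumWithStream(List<Integer> list) {
        return list.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        System.out.println(sumWithTryAdvance(list));
        System.out.println(sumWithForEachRemaining(list));
        System.out.println(sumWithStream(list));
    }
}
```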

So if you want to discuss the technical aspects of foreach vs. forEach(…), you may use this information.

But, as said, these points describe only potential performance effects, as the work of the optimizer and other aspects of the runtime environment may change the outcome completely. I think, as a rule of thumb, the smaller the loop body/action is, the more appropriate the forEach method is. This fits well with the guideline of avoiding overly long lambda expressions anyway.

score 1 · Accepted Answer · answered Jun 29 '16 at 06:54


It depends on the specific implementation.

In general, the forEach method and a foreach loop over an Iterator have pretty similar performance, as they use a similar level of abstraction. stream() is usually slower (often by 50-70%), as it adds another layer through which the underlying collection is accessed.

The advantages of stream() are generally the possible parallelism and the easy chaining of operations, with a lot of reusable ones provided by the JDK.
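To illustrate the chaining point, here is a minimal sketch that expresses the same filter, transform and sort steps once as a stream pipeline and once as a plain loop. ChainingDemo, the method names and the length threshold are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class ChainingDemo {
    // Stream version: each step is a reusable operation provided by the JDK.
    static List<String> withStream(List<String> names) {
        return names.stream()
                .filter(n -> n.length() > 3)   // keep names longer than 3 characters
                .map(String::toUpperCase)      // transform each remaining name
                .sorted()                      // natural ordering
                .collect(Collectors.toList());
    }

    // Loop version: the same steps written out by hand.
    static List<String> withLoop(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String n : names) {
            if (n.length() > 3) {
                result.add(n.toUpperCase());
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("dave", "bob", "carl", "anna");
        System.out.println(withStream(names));
        System.out.println(withLoop(names));
    }
}
```

Both methods return the same list; which one reads better is largely a matter of how many steps are chained.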

answered Jun 29 '16 at 06:54 by Zbynek Vyskovsky - kvr000


To the point, so `List.forEach` pays no performance penalty, but unfortunately does not have the expressive power of `List.stream().forEach`. – Joop Eggen Jun 29 '16 at 07:07

I don’t agree with the statement of streams being 50-70% slower. In most cases, these numbers stem from incorrect benchmarks measuring first-time overhead. – Holger Jun 30 '16 at 17:15
