548

Which of the following is better practice in Java 8?

Java 8:

joins.forEach(join -> mIrc.join(mSession, join));

Java 7:

for (String join : joins) {
    mIrc.join(mSession, join);
}

I have lots of for loops that could be "simplified" with lambdas, but is there really any advantage of using them? Would it improve their performance and readability?

EDIT

I'll also extend this question to longer methods. I know that you can't return or break the parent function from a lambda and this should also be taken into consideration when comparing them, but is there anything else to be considered?

Rafael
nebkat
  • I believe lambdas are intended to improve use of multi-core processors. If your target environment is multi-core, I believe lambdas will have improved performance over the Java 7 for loop. – DwB May 19 '13 at 14:04
  • 15
    There is no real performance advantage of one over the other. The first option is inspired by FP (which is commonly described as a "nicer" and "clearer" way to express your code). In reality, this is rather a question of style. – Eugene Loy May 19 '13 at 14:04
  • 5
    @Dwb: in this case, that is not relevant. forEach is not defined as being parallel or anything like that, so these two things are semantically equivalent. Of course it is possible to implement a parallel version of forEach (and one might already be present in the standard library), and in such a case lambda expression syntax would be very useful. – AardvarkSoup May 19 '13 at 14:05
  • 9
    @AardvarkSoup The instance on which forEach is called is a Stream (http://lambdadoc.net/api/java/util/stream/Stream.html). To request a parallel execution one could write joins.parallel().forEach(...) – mschenk74 May 19 '13 at 15:20
  • 14
    Is `joins.forEach((join) -> mIrc.join(mSession, join));` really a "simplification" of `for (String join : joins) { mIrc.join(mSession, join); }`? You've increased the punctuation count from 9 to 12, for the benefit of hiding the type of `join`. What you've really done is to put two statements onto one line. – Tom Hawtin - tackline Jul 26 '13 at 16:55
  • 1
    The parentheses are not required around the left side of the lambda. So it could be written as: joins.forEach(join -> mIrc.join(mSession, join)); – The Coordinator Oct 21 '13 at 10:01
  • 1
    Related question: http://stackoverflow.com/questions/23218874/what-is-difference-between-collection-stream-foreach-and-collection-foreac – Stuart Marks Apr 26 '14 at 03:27
  • 9
    Another point to consider is the limited variable capture ability of Java. With Stream.forEach(), you can't update local variables since their capture makes them final, which means you can't have stateful behaviour in the forEach lambda (unless you are prepared for some ugliness such as using class state variables). – RedGlyph Aug 16 '14 at 19:42
  • depends on the use case. – Najeeb Arif Jul 11 '16 at 09:32
  • If you need parallel programming enabled then use streams; otherwise simply use the trivial approach. – Najeeb Arif Jul 11 '16 at 09:33
  • 1
    Implementing Parallelism without the proper need will include undue overheads. – Najeeb Arif Jul 11 '16 at 09:33

8 Answers

633

The better practice is to use for-each. Besides violating the Keep It Simple, Stupid principle, the new-fangled forEach() has at least the following deficiencies:

  • Can't use non-final variables. So, code like the following can't be turned into a forEach lambda:
Object prev = null;
for(Object curr : list)
{
    if( prev != null )
        foo(prev, curr);
    prev = curr;
}
  • Can't handle checked exceptions. Lambdas aren't actually forbidden from throwing checked exceptions, but common functional interfaces like Consumer don't declare any. Therefore, any code that throws checked exceptions must wrap them in try-catch or Throwables.propagate(). But even if you do that, it's not always clear what happens to the thrown exception. It could get swallowed somewhere in the guts of forEach()

  • Limited flow-control. A return in a lambda equals a continue in a for-each, but there is no equivalent to a break. It's also difficult to do things like return values, short circuit, or set flags (which would have alleviated things a bit, if it wasn't a violation of the no non-final variables rule). "This is not just an optimization, but critical when you consider that some sequences (like reading the lines in a file) may have side-effects, or you may have an infinite sequence."

  • Might execute in parallel, which is a horrible, horrible thing for all but the 0.1% of your code that needs to be optimized. Any parallel code has to be thought through (even if it doesn't use locks, volatiles, and other particularly nasty aspects of traditional multi-threaded execution). Any bug will be tough to find.

  • Might hurt performance, because the JIT can't optimize forEach()+lambda to the same extent as plain loops, especially now that lambdas are new. By "optimization" I do not mean the overhead of calling lambdas (which is small), but the sophisticated analysis and transformation that the modern JIT compiler performs on running code.

  • If you do need parallelism, it is probably much faster and not much more difficult to use an ExecutorService. Streams are both automagical (read: don't know much about your problem) and use a specialized (read: inefficient for the general case) parallelization strategy (fork-join recursive decomposition).

  • Makes debugging more confusing, because of the nested call hierarchy and, god forbid, parallel execution. The debugger may have issues displaying variables from the surrounding code, and things like step-through may not work as expected.

  • Streams in general are more difficult to code, read, and debug. Actually, this is true of complex "fluent" APIs in general. The combination of complex single statements, heavy use of generics, and lack of intermediate variables conspire to produce confusing error messages and frustrate debugging. Instead of "this method doesn't have an overload for type X" you get an error message closer to "somewhere you messed up the types, but we don't know where or how." Similarly, you can't step through and examine things in a debugger as easily as when the code is broken into multiple statements, and intermediate values are saved to variables. Finally, reading the code and understanding the types and behavior at each stage of execution may be non-trivial.

  • Sticks out like a sore thumb. The Java language already has the for-each statement. Why replace it with a function call? Why encourage hiding side-effects somewhere in expressions? Why encourage unwieldy one-liners? Mixing regular for-each and new forEach willy-nilly is bad style. Code should speak in idioms (patterns that are quick to comprehend due to their repetition), and the fewer idioms are used the clearer the code is and less time is spent deciding which idiom to use (a big time-drain for perfectionists like myself!).
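To make the first three points concrete, here is a minimal sketch. The class name, the helper `write`, and the sample data are made up for illustration; the one-element array is just a common workaround for the capture rule, not a recommendation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ForEachLimits {

    // Hypothetical helper that declares a checked exception.
    static void write(Appendable out, String s) throws IOException {
        out.append(s).append('\n');
    }

    // A lambda may only capture effectively final locals, so the running
    // "prev" value is smuggled in through a one-element array.
    static List<String> pairs(List<String> list) {
        List<String> result = new ArrayList<>();
        String[] prev = { null };
        list.forEach(curr -> {
            if (prev[0] != null) {
                result.add(prev[0] + "-" + curr);
            }
            prev[0] = curr;
        });
        return result;
    }

    // Consumer.accept declares no checked exceptions, so the IOException
    // has to be caught and rethrown as an unchecked one inside the lambda.
    static String writeAll(List<String> list) {
        StringBuilder out = new StringBuilder();
        list.forEach(s -> {
            if (s.isEmpty()) {
                return; // behaves like "continue"; there is no way to "break"
            }
            try {
                write(out, s);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(pairs(Arrays.asList("a", "b", "c")));
        System.out.print(writeAll(Arrays.asList("x", "", "y")));
    }
}
```

With a plain for-each loop, `prev` could be an ordinary local, the method could simply declare `throws IOException`, and `break` would be available.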

As you can see, I'm not a big fan of the forEach() except in cases when it makes sense.

Particularly offensive to me is the fact that Stream does not implement Iterable (despite actually having method iterator) and cannot be used in a for-each, only with a forEach(). I recommend casting Streams into Iterables with (Iterable<T>)stream::iterator. A better alternative is to use StreamEx which fixes a number of Stream API problems, including implementing Iterable.
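A small sketch of the cast mentioned above, assuming a stream of strings and a made-up "stop" marker. Note that the resulting Iterable is single-use, because the underlying stream can only be consumed once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class StreamAsIterable {

    static List<String> upperUntilStop(Stream<String> stream) {
        List<String> out = new ArrayList<>();
        // Stream has an iterator() method, so stream::iterator matches
        // Iterable's single abstract method and the cast compiles.
        for (String s : (Iterable<String>) stream::iterator) {
            if (s.equals("stop")) {
                break; // ordinary flow control works again
            }
            out.add(s.toUpperCase());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(upperUntilStop(Stream.of("a", "b", "stop", "c")));
    }
}
```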

That said, forEach() is useful for the following:

  • Atomically iterating over a synchronized list. Prior to this, a list generated with Collections.synchronizedList() was atomic with respect to things like get or set, but was not thread-safe when iterating.

  • Parallel execution (using an appropriate parallel stream). This saves you a few lines of code vs using an ExecutorService, if your problem matches the performance assumptions built into Streams and Spliterators.

  • Specific containers which, like the synchronized list, benefit from being in control of iteration (although this is largely theoretical unless people can bring up more examples)

  • Calling a single function more cleanly by using forEach() and a method reference argument (i.e., list.forEach(obj::someMethod)). However, keep in mind the points on checked exceptions, more difficult debugging, and reducing the number of idioms you use when writing code.
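A rough sketch of the synchronized-list and method-reference points, run single-threaded so the result is deterministic. The class and method names are invented for the example. That the wrapper's forEach() holds the list's lock for the whole traversal is how the OpenJDK implementation behaves as far as I know; treat it as an assumption and check your JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SynchronizedTraversal {

    // External iteration: the caller has to hold the lock for the whole loop,
    // as the Collections.synchronizedList documentation requires.
    static int sumWithLoop(List<Integer> syncList) {
        int sum = 0;
        synchronized (syncList) {
            for (int value : syncList) {
                sum += value;
            }
        }
        return sum;
    }

    // Internal iteration with a method reference: the list controls the
    // traversal, so no explicit synchronized block appears at the call site.
    static String concat(List<Integer> syncList) {
        StringBuilder sb = new StringBuilder();
        syncList.forEach(sb::append);
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>(Arrays.asList(1, 2, 3)));
        System.out.println(sumWithLoop(list));
        System.out.println(concat(list));
    }
}
```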

EDIT: Looks like some of the original proposals for lambdas (such as http://www.javac.info/closures-v06a.html) solved some of the issues I mentioned (while adding their own complications, of course).

Arlo
Aleksandr Dubinsky
  • 2
    The supposed explanation why Stream doesn't implement Iterable is that in Java iterables are meant to always be re-usable (ie, can call iterator() multiple times), while in C# they're explicitly allowed to be single-use. But it doesn't seem like a good reason. Most C# developers don't appreciate this fine point, and it's not a huge source of bugs. In fact, there are already some Iterables in Java that are single-use. – Aleksandr Dubinsky Nov 27 '13 at 08:52
  • 110
    “Why encourage hiding side-effects somewhere in expressions?” is the wrong question. The functional `forEach` is there to encourage the functional style, i.e. using expressions *without* side-effects. If you encounter a situation where `forEach` does not work well with your side-effects, you should get the feeling that you are not using the right tool for the job. Then the simple answer is that your feeling is right, so stay with the for-each loop for that. The classical `for` loop did not become deprecated… – Holger Dec 03 '13 at 18:55
  • 22
    @Holger How can `forEach` be used without side-effects? – Aleksandr Dubinsky Dec 03 '13 at 20:56
  • 18
    All right, I wasn’t precise enough, `forEach` is the only stream operation intended for side-effects, but it’s not for side-effects like your example code, counting is a typical `reduce` operation. I would suggest, as a rule of thumb, to keep every operation which manipulates local variables or shall influence the control flow (incl exception handling) in a classical `for` loop. Regarding the original question I think, the problem stems from the fact that someone uses a stream where a simple `for` loop over the source of the stream would be sufficient. Use a stream where `forEach()` works only – Holger Dec 04 '13 at 13:54
  • 8
    @Holger What is an example of side-effects that `forEach` would be appropriate for? – Aleksandr Dubinsky Dec 04 '13 at 15:03
  • 38
    Something which processes each item individually and doesn’t try to mutate local variables. E.g. manipulating the items itself or printing them, writing/sending them to a file, network stream, etc. It’s no problem for me if you question these examples and don’t see any application for it; filtering, mapping, reducing, searching, and (to a lesser degree) collecting are the preferred operations of a stream. The forEach looks like a convenience to me for linking with existing APIs. And for parallel operations, of course. These won’t work with `for` loops. – Holger Dec 04 '13 at 17:21
  • 2
    Regarding your non-final variable example, if you're trying to use `forEach` to solve a counting problem you're doing it wrong. Counting may be expressed as a reduction like `myList.stream().reduce(0, (count, str) -> count + 1, (count1, count2) -> count1 + count2);` Is this a stylistic improvement over `for-each`? Not for this example, but is `for-each` an improvement over `myList.size()`? – Mark Apr 15 '14 at 18:53
  • @Mark actually `.count()` fulfills the first example. – Simon Kuang Apr 22 '14 at 21:46
  • 6
    @SimonKuang: My point was that the code example is not an appropriate application of `forEach`. The appropriate functional idiom is reduction. Presenting an inappropriate use case for `forEach` and then claiming it is deficient because it cannot be used to solve it is misleading. – Mark Apr 22 '14 at 22:28
  • 1
    @Mark The example is bad. It was the first 3 lines of code that came to mind. There are, of course, lots of other loop bodies that modify variables that can't be replaced with primitive stream operations ([too many, in fact](http://stackoverflow.com/questions/20470010/java-8-collect-successive-pairs-from-a-stream)). – Aleksandr Dubinsky Apr 22 '14 at 23:11
  • First example changed. Was: `int counter = 0; for(String string : myList) counter++;` – Aleksandr Dubinsky Apr 22 '14 at 23:24
  • @AleksandrDubinsky For what it's worth, I think that's a way better example. If `Stream.zip()` were still around there would be a kludgy way to handle those kinds of problems, but it would still be ugly. The more elegant functional solution would use recursion, but since Java does not support tail recursion optimization this would be a poor solution in Java. – Mark Apr 24 '14 at 01:44
  • 1
    "Might execute in parallel, which is a horrible, horrible thing for all but the 0.1% of your code that needs to be optimized." <- this is why we can't have nice things. – Griwes Jun 23 '15 at 12:25
  • 7
    I'm confused about *might execute in parallel*, does this mean it might do that even if you don't explicitly tell it to? I was rewriting some of my loops as `.forEach()`, then read this answer and stopped being unsure.. I don't want my code accidentally executing in another thread somehow without me being aware of it – ycomp Nov 03 '15 at 23:51
  • 6
    @ycomp Sorry for the confusion. It won't execute in parallel on its own, unless you get a Stream from another source and can't control how it was created. The "might" was meant more like, "you or someone else might decide to have it run in parallel for no good reason." – Aleksandr Dubinsky Nov 04 '15 at 00:55
  • 3
    Another oddity is since Java Arrays do not implement `java.lang.Iterable` interface, you can't use `forEach` with them. Especially for a newcomer, it is rather difficult to correctly grasp why `List`s are Iterable, but not arrays! – Kedar Mhaswade Feb 17 '16 at 17:09
  • 1
    This answer's author is right IMHO. Some comments here do sound weird, since to me it's clear that *any* possible application of `forEach()` is necessarily related to side effects, by definition (manipulating items, printing, sending to file are also side-effects). And if that's the case, why try to "disguise" a foreach loop as a "functional" pseudo-expression? – rsenna Jul 07 '16 at 16:43
  • 2
    "Stream does not implement Iterable" --> is for good reason: a `Stream` (like Iterator) is a one-use object, whereas an `Iterable` must produce a *new* iterator upon each request. This should not be the `Stream`'s responsibility, but the *source* of the stream (like Collection) can implement it. – Luke Usherwood Jan 13 '17 at 09:07
  • @LukeUsherwood I know. Look at the very first comment for why I don't find this argument convincing. – Aleksandr Dubinsky Jan 24 '17 at 11:01
  • Yeah spotted it after posting. Whether or not one agrees with the design, that is the defined contract, so any 'single-use' `Iterable` is fundamentally broken: yes it'll work in a 'for-each', but not other valid uses. (Case in point: many Guava library functions take `Iterable` instead of `Collection` as it's more abstract, hence more general & flexible. You should be able to call any number of such functions.) Anyway, this is a good list to reference regarding for-each vs `forEach()`, so thanks for compiling! – Luke Usherwood Jan 25 '17 at 08:22
  • "As you can see, I'm not a big fan of the forEach() except in cases when it makes sense." - Why on Earth would you use it if it does not make sense? – greg Sep 12 '18 at 12:37
  • 2
    Saying streams are more difficult to code/read is subjective and your opinion. Actually fluent/declarative APIs are designed to be more readable, and those with a FP background find them easier to code. So not sure why this is added to the answer. – wilmol Sep 08 '19 at 10:19
  • Also Java designers say ```foreach``` (and ```peek```) should only be used for reporting results (printing, logging etc.). Otherwise favour side-effect free operations or use a traditional for-each loop. – wilmol Sep 08 '19 at 10:21
  • Most of these "issues" are opinions or pertain to streams and other hypothetical use cases OP didn't ask about. The correct answer is "whichever you prefer". – shmosel Dec 29 '21 at 04:23
174

The advantage comes into play when the operations can be executed in parallel. (See http://java.dzone.com/articles/devoxx-2012-java-8-lambda-and - the section about internal and external iteration)

  • The main advantage from my point of view is that the implementation of what is to be done within the loop can be defined without having to decide if it will be executed in parallel or sequential

  • If you want your loop to be executed in parallel you could simply write

     joins.parallelStream().forEach(join -> mIrc.join(mSession, join));
    

    With a traditional for loop, you would have to write extra code for thread handling, etc.

Note: For my answer I assumed joins implements the java.util.stream.Stream interface. If joins implements only the java.lang.Iterable interface, this is no longer true.
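As a sketch of the point about not deciding up front, assume joins is a java.util.List (which offers both forEach and parallelStream). The class name and the `joined::add` body are stand-ins for the real mIrc.join(mSession, join) call, and a thread-safe set is used so the parallel run is safe.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

public class SameBodyTwoModes {

    static Set<String> run(List<String> joins, boolean parallel) {
        // Thread-safe target, because the body may run on several threads.
        Set<String> joined = ConcurrentHashMap.newKeySet();

        // The loop body is defined once, independent of how it is executed.
        Consumer<String> body = joined::add;

        if (parallel) {
            joins.parallelStream().forEach(body); // order is not guaranteed
        } else {
            joins.forEach(body);                  // sequential, in list order
        }
        return joined;
    }

    public static void main(String[] args) {
        List<String> joins = Arrays.asList("#java", "#irc", "#lambda");
        System.out.println(run(joins, false).equals(run(joins, true)));
    }
}
```

This only pays off if the body is safe to run concurrently and in any order, which, as the comments point out, may not hold for something like joining IRC channels.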

Lii
mschenk74
  • 4
    The slides of an oracle engineer he is referencing to (https://blogs.oracle.com/darcy/resource/Devoxx/Devoxx2012_ProjectLambda.pdf) don't mention the parallelism within those lambda expressions. The parallelism may occur within the bulk collection methods like `map` & `fold` that aren't really related to lambdas. – Thomas Jungblut May 19 '13 at 14:12
  • 1
    It doesn't really seem that OP's code will benefit from automatic parallelism here (especially since there is no guarantee that there will be any). We don't really know what "mIrc" is, but "join" doesn't really seem like something that can be executed out of order. – Eugene Loy May 19 '13 at 14:18
  • @ThomasJungblut In the API (http://lambdadoc.net/api/java/util/stream/Stream.html#forEach%28java.util.function.Consumer%29) there is some documentation about parallel and forEach. – mschenk74 May 19 '13 at 14:21
  • @leo I interpreted mIrc and join as an IRC client joining some channels within a chat session. But that might also be complete nonsense since the question doesn't define it. – mschenk74 May 19 '13 at 15:32
  • @mschenk74 That is correct, but it is only one of many one-line for loops in my code. I don't need parallel execution in my case but +1 for a valid reason why lambda is better than for loop. – nebkat May 19 '13 at 15:55
  • 12
    `Stream#forEach` and `Iterable#forEach` are not the same thing. OP is asking about `Iterable#forEach`. – gvlasov Oct 28 '14 at 19:38
  • 2
    I used the UPDATEX style since there were changes in the specification between the time the question was asked and the time the answer got updated. Without the history of the answer it would be even more confusing I thought. – mschenk74 Apr 15 '15 at 07:12
  • 1
    Could anyone please explain to me why this answer is not valid if `joins` is implementing `Iterable` instead of `Stream`? From a couple of things I've read, OP should be able to do `joins.stream().forEach((join) -> mIrc.join(mSession, join));` and `joins.parallelStream().forEach((join) -> mIrc.join(mSession, join));` if `joins` implements `Iterable` – Blueriver Aug 18 '16 at 04:12
  • An argument for using for loops might be because checked exceptions thrown inside of lambdas have to be caught within that same code block - you can't add them to the calling method's signature like you can with a simple loop. – b15 Mar 04 '22 at 18:28
130

When reading this question one can get the impression, that Iterable#forEach in combination with lambda expressions is a shortcut/replacement for writing a traditional for-each loop. This is simply not true. This code from the OP:

joins.forEach(join -> mIrc.join(mSession, join));

is not intended as a shortcut for writing

for (String join : joins) {
    mIrc.join(mSession, join);
}

and should certainly not be used in this way. Instead it is intended as a shortcut (although it is not exactly the same) for writing

joins.forEach(new Consumer<T>() {
    @Override
    public void accept(T join) {
        mIrc.join(mSession, join);
    }
});

And it is as a replacement for the following Java 7 code:

final Consumer<T> c = new Consumer<T>() {
    @Override
    public void accept(T join) {
        mIrc.join(mSession, join);
    }
};
for (T t : joins) {
    c.accept(t);
}

Replacing the body of a loop with a functional interface, as in the examples above, makes your code more explicit: You are saying that (1) the body of the loop does not affect the surrounding code and control flow, and (2) the body of the loop may be replaced with a different implementation of the function, without affecting the surrounding code. Not being able to access non final variables of the outer scope is not a deficit of functions/lambdas, it is a feature that distinguishes the semantics of Iterable#forEach from the semantics of a traditional for-each loop. Once one gets used to the syntax of Iterable#forEach, it makes the code more readable, because you immediately get this additional information about the code.

Traditional for-each loops will certainly stay good practice (to avoid the overused term "best practice") in Java. But this doesn't mean, that Iterable#forEach should be considered bad practice or bad style. It is always good practice, to use the right tool for doing the job, and this includes mixing traditional for-each loops with Iterable#forEach, where it makes sense.

Since the downsides of Iterable#forEach have already been discussed in this thread, here are some reasons, why you might probably want to use Iterable#forEach:

  • To make your code more explicit: As described above, Iterable#forEach can make your code more explicit and readable in some situations.

  • To make your code more extensible and maintainable: Using a function as the body of a loop allows you to replace this function with different implementations (see Strategy Pattern). You could e.g. easily replace the lambda expression with a method call that may be overridden by subclasses:

    joins.forEach(getJoinStrategy());
    

    Then you could provide default strategies using an enum, that implements the functional interface. This not only makes your code more extensible, it also increases maintainability because it decouples the loop implementation from the loop declaration.

  • To make your code more debuggable: Separating the loop implementation from the declaration can also make debugging easier, because you could have a specialized debug implementation that prints out debug messages, without the need to clutter your main code with if(DEBUG)System.out.println(). The debug implementation could e.g. be a delegate that decorates the actual function implementation.

  • To optimize performance-critical code: Contrary to some of the assertions in this thread, Iterable#forEach does already provide better performance than a traditional for-each loop, at least when using ArrayList and running Hotspot in "-client" mode. While this performance boost is small and negligible for most use cases, there are situations, where this extra performance can make a difference. E.g. library maintainers will certainly want to evaluate, if some of their existing loop implementations should be replaced with Iterable#forEach.

    To back this statement up with facts, I have done some micro-benchmarks with Caliper. Here is the test code (latest Caliper from git is needed):

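    import java.util.ArrayList;
    import java.util.function.Consumer;
    // The Caliper annotations (@VmOptions, @Param, @BeforeExperiment, @Benchmark) also need imports; their packages depend on the Caliper version.
    
    @VmOptions("-server")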
    public class Java8IterationBenchmarks {
    
        public static class TestObject {
            public int result;
        }
    
        public @Param({"100", "10000"}) int elementCount;
    
        ArrayList<TestObject> list;
        TestObject[] array;
    
        @BeforeExperiment
        public void setup(){
            list = new ArrayList<>(elementCount);
            for (int i = 0; i < elementCount; i++) {
                list.add(new TestObject());
            }
            array = list.toArray(new TestObject[list.size()]);
        }
    
        @Benchmark
        public void timeTraditionalForEach(int reps){
            for (int i = 0; i < reps; i++) {
                for (TestObject t : list) {
                    t.result++;
                }
            }
            return;
        }
    
        @Benchmark
        public void timeForEachAnonymousClass(int reps){
            for (int i = 0; i < reps; i++) {
                list.forEach(new Consumer<TestObject>() {
                    @Override
                    public void accept(TestObject t) {
                        t.result++;
                    }
                });
            }
            return;
        }
    
        @Benchmark
        public void timeForEachLambda(int reps){
            for (int i = 0; i < reps; i++) {
                list.forEach(t -> t.result++);
            }
            return;
        }
    
        @Benchmark
        public void timeForEachOverArray(int reps){
            for (int i = 0; i < reps; i++) {
                for (TestObject t : array) {
                    t.result++;
                }
            }
        }
    }
    

    And here are the results:

    When running with "-client", Iterable#forEach outperforms the traditional for loop over an ArrayList, but is still slower than directly iterating over an array. When running with "-server", the performance of all approaches is about the same.

  • To provide optional support for parallel execution: It has already been said here, that the possibility to execute the functional interface of Iterable#forEach in parallel using streams, is certainly an important aspect. Since Collection#parallelStream() does not guarantee, that the loop is actually executed in parallel, one must consider this an optional feature. By iterating over your list with list.parallelStream().forEach(...);, you explicitly say: This loop supports parallel execution, but it does not depend on it. Again, this is a feature and not a deficit!

    By moving the decision for parallel execution away from your actual loop implementation, you allow optional optimization of your code, without affecting the code itself, which is a good thing. Also, if the default parallel stream implementation does not fit your needs, no one is preventing you from providing your own implementation. You could e.g. provide an optimized collection depending on the underlying operating system, on the size of the collection, on the number of cores, and on some preference settings:

    public abstract class MyOptimizedCollection<E> implements Collection<E>{
        private enum OperatingSystem{
            LINUX, WINDOWS, ANDROID
        }
        private OperatingSystem operatingSystem = OperatingSystem.WINDOWS;
        private int numberOfCores = Runtime.getRuntime().availableProcessors();
        private Collection<E> delegate;
    
        @Override
        public Stream<E> parallelStream() {
            // Constant-first equals() avoids a NullPointerException when the property is not set.
            if (!"true".equals(System.getProperty("parallelSupport"))) {
                return this.delegate.stream();
            }
            switch (operatingSystem) {
                case WINDOWS:
                    if (numberOfCores > 3 && delegate.size() > 10000) {
                        return this.delegate.parallelStream();
                    }else{
                        return this.delegate.stream();
                    }
                case LINUX:
                    return SomeVerySpecialStreamImplementation.stream(this.delegate.spliterator());
                case ANDROID:
                default:
                    return this.delegate.stream();
            }
        }
    }
    

    The nice thing here is, that your loop implementation doesn't need to know or care about these details.
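The strategy and debug-decorator ideas from the list above could look roughly like this. Everything here (DefaultStrategy, debugging, the LOG list standing in for real side effects) is hypothetical and only meant to show the shape of the idea, not a definitive design.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class JoinStrategies {

    // Stand-in for real side effects such as mIrc.join(mSession, join).
    static final List<String> LOG = new ArrayList<>();

    // Default strategies provided as an enum that implements the functional interface.
    enum DefaultStrategy implements Consumer<String> {
        JOIN {
            @Override
            public void accept(String channel) {
                LOG.add("join " + channel);
            }
        },
        SKIP {
            @Override
            public void accept(String channel) {
                // deliberately does nothing
            }
        }
    }

    // Debug decorator: records each item, then delegates to the real body.
    static <T> Consumer<T> debugging(Consumer<T> delegate) {
        return item -> {
            LOG.add("debug " + item);
            delegate.accept(item);
        };
    }

    // Subclasses may override this to plug in a different loop body.
    protected Consumer<String> getJoinStrategy() {
        return DefaultStrategy.JOIN;
    }

    void joinAll(List<String> joins) {
        joins.forEach(getJoinStrategy());
    }

    public static void main(String[] args) {
        List<String> joins = Arrays.asList("#java", "#irc");
        new JoinStrategies().joinAll(joins);
        joins.forEach(debugging(DefaultStrategy.SKIP));
        System.out.println(LOG);
    }
}
```

The loop declaration in joinAll never changes; only the Consumer handed to forEach does.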

Balder
  • 5
    You have an interesting view in this discussion and bring up a number of points. I'll try to address them. You propose to switch between `forEach` and `for-each` based on some criteria regarding the nature of the loop body. Wisdom and discipline to follow such rules are the hallmark of a good programmer. Such rules are also his bane, because the people around him either don't follow them or disagree. Eg, using checked vs unchecked Exceptions. This situation seems even more nuanced. But, if the body "does not affect surround code or flow control," isn't factoring it out as a function better? – Aleksandr Dubinsky Apr 22 '14 at 22:43
  • To make code more extensible, you argue factoring out member functions (which can be called from a `for-each`) and using function objects (which can also be called from a `for-each`). I am not opposed to either of those techniques, and use function objects too. I am just not sure why they encourage the use of `forEach()` (which, again, won't let them use checked exceptions). Yes, the syntax is a bit shorter. What other reasons? – Aleksandr Dubinsky Apr 22 '14 at 22:46
  • Arguing performance is a can of worms. The optimizations that the JVM can do to a `for-each` go beyond the cost of a list access. The JVM can move redundant expression evaluations, allocate objects on the stack instead of the heap, etc, etc. This is where lambdas fall behind. – Aleksandr Dubinsky Apr 22 '14 at 22:58
  • I have been disappointed with the parallel execution of streams. E.g., the performance is low if the loop body is heavy with few iterations. Rather than write my own collection or spliterator, it's easier to use `ExecutorService`. **And** it also takes lambdas. – Aleksandr Dubinsky Apr 22 '14 at 23:08
  • 5
    Thanks for the detailed comments Aleksandr. `But, if the body "does not affect surround code or flow control," isn't factoring it out as a function better?`. Yes, this will often be the case in my opinion - factoring out these loops as functions is a natural consequence. – Balder Apr 23 '14 at 05:06
  • 2
    Regarding the performance issue - I guess it depends very much on the nature of the loop. In a project I'm working on, I have been using function-style loops similar to `Iterable#forEach` before Java 8 just because of the performance increase. The project in question has one main loop similar to a game loop, with an undefined number of nested sub-loops, where clients can plug-in loop participants as functions. Such a software structure greatly benefits from `Iterable#forEach`. – Balder Apr 23 '14 at 05:07
  • 2
    My whole point is, that one should not compare the two loop approaches in the sense that one is generally preferable to the other. Both loops have their uses, and it is the responsibility of the programmer to know the ins and outs of both loops, so he can effectively use them. – Balder Apr 23 '14 at 05:07
  • 8
    There is a sentence at the very end of my critique: "Code should speak in idioms, and the fewer idioms are used the clearer the code is and less time is spent deciding which idiom to use". I began to deeply appreciate this point when I switched from C# to Java. – Aleksandr Dubinsky Apr 23 '14 at 16:07
  • 8
    That's a bad argument. You could use it to justify anything you want: why you shouldn't use a for loop, because a while loop is good enough and that's one less idiom. Heck, why use any loop, switch, or try/catch statement when goto can do all of that and more. – tapichu Nov 18 '14 at 14:24
  • 1
    @AleksandrDubinsky With regard to, "the fewer idioms are used the clearer the code is and less time is spent deciding which idiom to use", that's not true. The fewer idioms used, the more the programmer may have to struggle to fit a triangular idiom into a square slot. There should be exactly as many idioms as are necessary to code well. – steventrouble Mar 16 '15 at 22:55
  • 2
    @tapichu Actually, goto leads to many more "idioms"--very creative solutions that go beyond if, for, while, return, throw. That's what made it bad. For loop restricts weird things that can be done with while loops. Etc. – Aleksandr Dubinsky Mar 20 '15 at 03:21
  • @steventrouble To continue with the goto analogy, its flexibility let you fit a fractally-bounded shape into an appropriately contorted hole. Modern programming is about restriction. When you become more experienced, you'll understand. You'll also realize that the statement "there should be exactly as many idioms as are necessary to code well" is not actually an argument for anything. – Aleksandr Dubinsky Mar 20 '15 at 03:26
  • I'll only say `+int(PI/3)` *for Science!* - strictly speaking, for the Caliper benchmark... it certainly proves that `forEach` is *not* a CPU hog as believed by many (the "idiomatic-ness" of it & JDK 8 market penetration [Android anyone?] being another thing, though). –  May 05 '15 at 22:07
  • Can someone explain verbosely why `Iterable#forEach()` outperforms traditional loop? – dma_k Apr 10 '17 at 08:58
  • @AleksandrDubinsky - I find your comparison to C# suspect since C# does not expose a foreach lambda construct at all except in the case of explicit parallelization. In this case, it is Java that has too many idioms :P – hoodaticus Jun 09 '17 at 18:39
  • 1
    This is a very balanced response to Aleksandr's above. I think it is well written. It would be best to emphasize, or put first, the paragraph about using the right tool for the job. I think that provides an important context to the rest of the narrative. – SimplyKnownAsG Mar 15 '22 at 16:51
13

forEach() can be implemented to be faster than the for-each loop, because the iterable knows the best way to iterate over its own elements and does not have to go through the standard iterator protocol. So the difference is between looping internally and looping externally.

For example ArrayList.forEach(action) may be simply implemented as

for (int i = 0; i < size; i++)
    action.accept(elements[i]);

as opposed to the for-each loop which requires a lot of scaffolding

Iterator iter = list.iterator();
while (iter.hasNext()) {
    Object next = iter.next();
    // do something with `next`
}

However, we also need to account for two overhead costs of using forEach(): one is creating the lambda object, the other is invoking the lambda method. They are probably not significant.

see also http://journal.stuffwithstuff.com/2013/01/13/iteration-inside-and-out/ for comparing internal/external iterations for different use cases.
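To make the internal/external distinction concrete, here is a minimal sketch with a hypothetical `IntTree` class (not part of the JDK, invented for illustration). Its `forEach` is a plain recursive in-order walk, while `iterator()` takes the lazy route of buffering the elements first; a real collection would write a proper stateful iterator instead.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical binary tree, used only to illustrate internal iteration.
class IntTree implements Iterable<Integer> {
    private final IntTree left;
    private final int value;
    private final IntTree right;

    IntTree(IntTree left, int value, IntTree right) {
        this.left = left;
        this.value = value;
        this.right = right;
    }

    // Internal iteration: a recursive in-order walk, no iterator state needed.
    @Override
    public void forEach(Consumer<? super Integer> action) {
        if (left != null) left.forEach(action);
        action.accept(value);
        if (right != null) right.forEach(action);
    }

    // External iteration must still be supported. This sketch simply buffers
    // the elements (an assumption made for brevity, not a recommendation).
    @Override
    public Iterator<Integer> iterator() {
        List<Integer> buffer = new ArrayList<>();
        forEach(buffer::add);
        return buffer.iterator();
    }

    public static void main(String[] args) {
        IntTree tree = new IntTree(new IntTree(null, 1, null), 2, new IntTree(null, 3, null));
        tree.forEach(System.out::println);         // internal iteration
        for (int v : tree) System.out.println(v);  // external iteration
    }
}
```

Both loops in `main` visit the elements in the same order; the difference is only in who drives the traversal.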

ZhongYu
  • 19,446
  • 5
  • 33
  • 61
  • 9
    why does the iterable know the best way but the iterator does not? – mschenk74 May 19 '13 at 18:15
  • 2
    no essential difference, but extra code is needed to conform to the iterator interface, which may be more costly. – ZhongYu May 19 '13 at 18:20
  • 1
    @zhong.j.yu if you implement Collection you also implement Iterable anyway. So, there is no code overhead in terms of "adding more code to implement missing interface methods", if that's your point. As mschenk74 said, there seems to be no reason why you cannot tweak your iterator to know how to iterate over your collection in the best possible way. I do agree that there might be overhead for iterator creation, but seriously, those things are usually so cheap that you can say they have zero cost... – Eugene Loy May 19 '13 at 18:36
  • 4
    for example iterating a tree: `void forEach(Consumer v){leftTree.forEach(v);v.accept(rootElem);rightTree.forEach(v);}`, this is more elegant than the external iteration, and you can decide on how to best synchronize – ratchet freak May 19 '13 at 19:03
  • @leo the authors of these collections are -extremely- concerned with performance. – ZhongYu May 19 '13 at 20:49
  • 1
    Funnily enough, the only comment in the `String.join` methods (okay, wrong join) is "Number of elements not likely worth Arrays.stream overhead." so they use a posh for loop. – Tom Hawtin - tackline Jul 26 '13 at 16:07
10

TL;DR: List.stream().forEach() was the fastest.

I felt I should add my results from benchmarking iteration. I took a very simple approach (no benchmarking frameworks) and benchmarked 5 different methods:

  1. classic for
  2. classic foreach
  3. List.forEach()
  4. List.stream().forEach()
  5. List.parallelStream().forEach()

The testing procedure and parameters:

private List<Integer> list;
private final int size = 1_000_000;

public MyClass(){
    list = new ArrayList<>();
    Random rand = new Random();
    for (int i = 0; i < size; ++i) {
        list.add(rand.nextInt(size * 50));
    }    
}
private void doIt(Integer i) {
    i *= 2; //so it won't get JITed out
}

The list in this class is iterated over and doIt(Integer i) is applied to all its members, each time via a different method. In the Main class I run the tested method three times to warm up the JVM. I then run the test method 1000 times, summing the time it takes for each iteration method (using System.nanoTime()). After that's done I divide that sum by 1000, and the result is the average time. Example:

myClass.fored();
myClass.fored();
myClass.fored();
for (int i = 0; i < reps; ++i) {
    begin = System.nanoTime();
    myClass.fored();
    end = System.nanoTime();
    nanoSum += end - begin;
}
System.out.println(nanoSum / reps);

I ran this on an i5 4-core CPU, with Java version 1.8.0_05.

classic for

for(int i = 0, l = list.size(); i < l; ++i) {
    doIt(list.get(i));
}

execution time: 4.21 ms

classic foreach

for(Integer i : list) {
    doIt(i);
}

execution time: 5.95 ms

List.forEach()

list.forEach((i) -> doIt(i));

execution time: 3.11 ms

List.stream().forEach()

list.stream().forEach((i) -> doIt(i));

execution time: 2.79 ms

List.parallelStream().forEach()

list.parallelStream().forEach((i) -> doIt(i));

execution time: 3.6 ms
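As a hedged variation on the harness above: `doIt` discards its result, so the JIT may be free to remove the work entirely. The sketch below folds every element into a sum and publishes it through a volatile field so the work stays observable. The class name `IterationTiming`, the seed, and the warm-up and repetition counts are assumptions for illustration. This is still a hand-rolled timer, not a substitute for a framework such as JMH, and it makes no claim about which variant wins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.LongSupplier;

class IterationTiming {
    // Written on every run so the computed sums stay observable.
    static volatile long sink;

    static List<Integer> randomList(int size, long seed) {
        Random rand = new Random(seed);
        List<Integer> list = new ArrayList<>(size);
        for (int i = 0; i < size; ++i) list.add(rand.nextInt(1000));
        return list;
    }

    static long sumIndexed(List<Integer> list) {
        long sum = 0;
        for (int i = 0, n = list.size(); i < n; ++i) sum += list.get(i);
        return sum;
    }

    static long sumForEachLoop(List<Integer> list) {
        long sum = 0;
        for (Integer v : list) sum += v;
        return sum;
    }

    static long sumForEachMethod(List<Integer> list) {
        long[] sum = {0};                 // one-element array: lambdas cannot mutate locals
        list.forEach(v -> sum[0] += v);
        return sum[0];
    }

    static long sumStream(List<Integer> list) {
        long[] sum = {0};
        list.stream().forEach(v -> sum[0] += v);
        return sum[0];
    }

    static long sumParallelStream(List<Integer> list) {
        LongAdder sum = new LongAdder();  // thread-safe accumulator for the parallel case
        list.parallelStream().forEach(v -> sum.add(v));
        return sum.sum();
    }

    // Average wall-clock nanoseconds per call, after some warm-up calls.
    static long averageNanos(LongSupplier task, int warmups, int reps) {
        for (int i = 0; i < warmups; ++i) sink = task.getAsLong();
        long total = 0;
        for (int i = 0; i < reps; ++i) {
            long begin = System.nanoTime();
            sink = task.getAsLong();
            total += System.nanoTime() - begin;
        }
        return total / reps;
    }

    public static void main(String[] args) {
        List<Integer> list = randomList(1_000_000, 42L);
        System.out.println("classic for:      " + averageNanos(() -> sumIndexed(list), 20, 100) + " ns");
        System.out.println("classic foreach:  " + averageNanos(() -> sumForEachLoop(list), 20, 100) + " ns");
        System.out.println("List.forEach():   " + averageNanos(() -> sumForEachMethod(list), 20, 100) + " ns");
        System.out.println("stream():         " + averageNanos(() -> sumStream(list), 20, 100) + " ns");
        System.out.println("parallelStream(): " + averageNanos(() -> sumParallelStream(list), 20, 100) + " ns");
    }
}
```

Note that the parallel variant needs a thread-safe accumulator, which by itself changes what is being measured.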

Assaf
  • 1,352
  • 10
  • 19
  • 26
    How do you get those numbers? Which framework for benchmark are you using? If you're using none and just plain `System.out.println` to display this data naively, then all the results are useless. – Luiggi Mendoza Jan 13 '15 at 15:51
  • 2
    No framework. I use `System.nanoTime()`. If you read the answer you'll see how it was done. I don't think that makes it useless seeing as this is a _relative_ question. I don't care how well a certain method did, I care how well it did compared to the other methods. – Assaf Jan 15 '15 at 06:08
  • 35
    And that's the purpose of a good micro benchmark. Since you haven't met such requirements, the results are useless. – Luiggi Mendoza Jan 15 '15 at 14:29
  • 7
    I can recommend getting to know JMH instead, this is what's being used for Java itself and puts a lot of effort of getting correct numbers: http://openjdk.java.net/projects/code-tools/jmh/ – dsvensson Apr 09 '15 at 12:45
  • 1
    I agree with @LuiggiMendoza. There is no way to know that these results are consistent or valid. God knows how many benchmarks I have done that keep reporting different results, especially depending on iteration order, size and what not. – mjs Jun 19 '16 at 08:45
  • 1
    Sigh. In any sane compiler a classic for loop should be far faster than anything except maybe (but usually not) the parallel one. What this tells me is that the JVM is better at inlining than loop unrolling. – hoodaticus Jun 09 '17 at 18:44
  • use JMH instead of System.nanoTime and System.out.println – Awan Biru Aug 25 '19 at 19:14
9

I feel that I need to extend my comment a bit...

About paradigm\style

That's probably the most notable aspect. FP became popular due to what you can get avoiding side-effects. I won't delve deep into what pros\cons you can get from this, since this is not related to the question.

However, I will say that iteration using Iterable.forEach is inspired by FP and is rather the result of bringing more FP to Java (ironically, I'd say that there is not much use for forEach in pure FP, since it does nothing except introduce side-effects).

In the end I would say that it is rather a matter of taste\style\paradigm you are currently writing in.

About parallelism.

From a performance point of view there are no promised notable benefits from using Iterable.forEach over foreach(...).

According to the official docs on Iterable.forEach:

Performs the given action on the contents of the Iterable, in the order elements occur when iterating, until all elements have been processed or the action throws an exception.

... i.e. the docs make it pretty clear that there will be no implicit parallelism. Adding it would be an LSP violation.

Now, there are "parallel collections" that are promised in Java 8, but to work with those you need to be more explicit and take some extra care when using them (see mschenk74's answer for example).

BTW: in this case Stream.forEach will be used, and it doesn't guarantee that the actual work will be done in parallel (that depends on the underlying collection).
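A small sketch of the ordering difference described above, using a hypothetical `ForEachOrdering` class. `Iterable.forEach` on a list visits elements in iteration order, `parallelStream().forEach` makes no ordering promise, and `forEachOrdered` restores encounter order at some cost to parallelism.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ForEachOrdering {
    // Iterable.forEach: sequential, in iteration order.
    static List<Integer> viaIterableForEach(List<Integer> src) {
        List<Integer> out = new ArrayList<Integer>();
        src.forEach(out::add);
        return out;
    }

    // Parallel Stream.forEach: same elements, but the order is not guaranteed.
    // The target list must be thread-safe because several threads may add to it.
    static List<Integer> viaParallelForEach(List<Integer> src) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<Integer>());
        src.parallelStream().forEach(out::add);
        return out;
    }

    // forEachOrdered: keeps encounter order even on a parallel stream.
    static List<Integer> viaParallelForEachOrdered(List<Integer> src) {
        List<Integer> out = Collections.synchronizedList(new ArrayList<Integer>());
        src.parallelStream().forEachOrdered(out::add);
        return out;
    }

    public static void main(String[] args) {
        List<Integer> src = IntStream.rangeClosed(1, 20).boxed().collect(Collectors.toList());
        System.out.println(viaIterableForEach(src));
        System.out.println(viaParallelForEach(src));        // order may vary between runs
        System.out.println(viaParallelForEachOrdered(src));
    }
}
```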

UPDATE: this might not be obvious, and may seem a little stretched at first glance, but there is another facet to the style and readability perspective.

First of all - plain old for loops are plain and old. Everybody already knows them.

Second, and more important - you probably want to use Iterable.forEach only with one-liner lambdas. If the "body" gets heavier, they tend to be not that readable. You have 2 options from here - use inner classes (yuck) or use a plain old for loop. People often get annoyed when they see the same thing (iteration over collections) being done in various ways/styles in the same codebase, and this seems to be the case here.

Again, this might or might not be an issue. Depends on people working on code.
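A third option, raised in the comments on this answer, is to move a heavier body into a named method and pass a method reference. A minimal sketch, with a hypothetical `LoopBodyExtraction` class and `handle` method invented for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LoopBodyExtraction {
    private final List<String> log = new ArrayList<>();

    // Hypothetical multi-step loop body, kept out of the lambda.
    void handle(String name) {
        String trimmed = name.trim();
        if (trimmed.isEmpty()) {
            return; // plays the role of `continue` in a for loop
        }
        log.add("joined " + trimmed.toLowerCase());
    }

    List<String> run(List<String> names) {
        names.forEach(this::handle); // the forEach call itself stays a one-liner
        return log;
    }

    public static void main(String[] args) {
        // prints [joined #java, joined #irc]
        System.out.println(new LoopBodyExtraction().run(Arrays.asList(" #Java ", "", "#irc")));
    }
}
```

Whether this reads better than a plain for loop is still a matter of taste; note that there is no equivalent of `break` here.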

Eugene Loy
  • 12,224
  • 8
  • 53
  • 79
  • 1
    Parallelism doesn't need new "parallel collections". It just depends on whether you asked for a sequential stream (using collection.stream()) or for a parallel one (using collection.parallelStream()). – JB Nizet May 19 '13 at 17:17
  • @JBNizet According to the docs, Collection.parallelStream() does not guarantee that the implementing collection will return a parallel stream. I am actually wondering myself when this might happen, but it probably does depend on the collection. – Eugene Loy May 19 '13 at 17:52
  • agreed. It also depends on the collection. But my point was that parallel foreach loops were already available with all the standard collections (ArrayList, etc.). No need to wait for "parallel collections". – JB Nizet May 19 '13 at 18:06
  • @JBNizet I agree with your point, but that's not really what I meant by "parallel collections" in the first place. I refer to Collection.parallelStream(), which was added in Java 8, as "parallel collections" by analogy with Scala's concept that does pretty much the same. Also, I'm not sure what it is called in the JSRs, but I saw a couple of papers that use the same terminology for this Java 8 feature. – Eugene Loy May 19 '13 at 18:21
  • I wouldn't use the term "collection" for a parallel stream. Stream doesn't extend Collection, and doesn't contain anything. It's just a pipeline of operations. – JB Nizet May 19 '13 at 18:29
  • 1
    for the last paragraph you can use a function reference: `collection.forEach(MyClass::loopBody);` – ratchet freak May 19 '13 at 19:16
  • In English, the slash ("/") is used instead of the backslash ("\") – Aleksandr Dubinsky Oct 14 '13 at 22:10
6

One of the most unpleasant limitations of the functional forEach is its lack of support for checked exceptions.

One possible workaround is to replace the terminal forEach with a plain old foreach loop:

    Stream<String> stream = Stream.of("", "1", "2", "3").filter(s -> !s.isEmpty());
    Iterable<String> iterable = stream::iterator;
    for (String s : iterable) {
        fileWriter.append(s);
    }
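For comparison, here is a sketch of what the limitation looks like in practice, using a hypothetical `CheckedExceptionDemo` class. The variable is typed as `Writer` because `Writer.append` declares `IOException`; inside `forEach` the lambda has to catch it, and this sketch chooses to rethrow it as `UncheckedIOException`, which is one common option and not the only one.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

class CheckedExceptionDemo {
    // Plain loop: the checked IOException simply propagates to the caller.
    static void writeWithLoop(List<String> items, Writer out) throws IOException {
        for (String s : items) {
            out.append(s);
        }
    }

    // forEach: Consumer.accept declares no checked exceptions, so the lambda
    // must handle IOException itself; here it is wrapped and rethrown unchecked.
    static void writeWithForEach(List<String> items, Writer out) {
        items.forEach(s -> {
            try {
                out.append(s);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    public static void main(String[] args) throws IOException {
        List<String> items = Arrays.asList("1", "2", "3");
        Writer a = new StringWriter();
        writeWithLoop(items, a);
        Writer b = new StringWriter();
        writeWithForEach(items, b);
        System.out.println(a + " " + b); // prints "123 123"
    }
}
```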

Here is a list of the most popular questions with other workarounds for checked exception handling within lambdas and streams:

Java 8 Lambda function that throws exception?

Java 8: Lambda-Streams, Filter by Method with Exception

How can I throw CHECKED exceptions from inside Java 8 streams?

Java 8: Mandatory checked exceptions handling in lambda expressions. Why mandatory, not optional?

Community
  • 1
  • 1
Vadzim
  • 24,954
  • 11
  • 143
  • 151
2

The advantage of the Java 1.8 forEach method over the 1.7 enhanced for loop is that while writing code you can focus on the business logic only.

The forEach method takes a java.util.function.Consumer object as an argument, so it helps to keep the business logic in a separate location where you can reuse it at any time.

Have a look at the snippet below:

  • Here I have created a new class that implements the accept method of the Consumer interface, where you can add functionality beyond plain iteration.

    import java.util.ArrayList;
    import java.util.function.Consumer;
    
    class MyConsumer implements Consumer<Integer>{
    
        @Override
        public void accept(Integer o) {
            System.out.println("Here you can also add your business logic that will work with Iteration and you can reuse it."+o);
        }
    }
    
    public class ForEachConsumer {
    
        public static void main(String[] args) {
    
            // Creating simple ArrayList.
            ArrayList<Integer> aList = new ArrayList<>();
            for(int i=1;i<=10;i++) aList.add(i);
    
            // Calling forEach with a custom Consumer implementation.
            MyConsumer consumer = new MyConsumer();
            aList.forEach(consumer);
    
    
            // Using Lambda Expression for Consumer. (Functional Interface) 
            Consumer<Integer> lambda = (Integer o) ->{
                System.out.println("Using Lambda Expression to iterate and do something else(BI).. "+o);
            };
            aList.forEach(lambda);
    
            // Using Anonymous Inner Class.
            aList.forEach(new Consumer<Integer>(){
                @Override
                public void accept(Integer o) {
                    System.out.println("Calling with Anonymous Inner Class "+o);
                }
            });
        }
    }
    
Hardik Patel
  • 1,033
  • 1
  • 12
  • 16