How Internal iteration looks like for java 8 streams

Question

I was trying to understand the difference between external vs internal iterator where external iterator uses Iterator to enumerate its elements

List<String> alphabets = Arrays.asList(new String[]{"a","b","b","d"});
         
for(String letter: alphabets){
   System.out.println(letter.toUpperCase());
}

Above code in the background does something like below

List<String> alphabets = Arrays.asList(new String[]{"a","b","b","d"});    
Iterator<String> iterator = alphabets.listIterator();
while(iterator.hasNext()){
     System.out.println(iterator.next().toUpperCase());
}

But for internal iteration, everything is done in the background which is a black box to me and I want to get deep dive into it.

Like for below code, iteration is happening in the background but exactly what is happening and how it is different when compare with foreach loop?

List<String> alphabets = Arrays.asList(new String[]{"a","b","b","d"});
alphabets.stream().forEach(l -> l.toUpperCase());

This is my understanding of external and internal iteration. Please correct me if I am wrong.

You haven't shown a stream anywhere in this, just iterators. — chrylis -cautiouslyoptimistic-, Nov 03 '21 at 04:52
Have you tried looking at the source code of `Collection.forEach`? That would be the first place to start. — ernest_k, Nov 03 '21 at 05:03
Here's a link to the source that ernest_k talks about: [Iterable.forEach(Consumer)](https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/Iterable.java#L72). That's the default implementation; some implementations of `Iterable` may do something a little differently. — Slaw, Nov 03 '21 at 05:39
Hi @ernest_k, do you mean that internal iteration is done with Spliterator by streams? — Dan, Nov 03 '21 at 08:24

score 7 · Accepted Answer · answered Nov 03 '21 at 12:17

The whole point of internal iteration is that the iteration logic can be implemented in different ways.

The default implementation of Iterable looks like

default void forEach(Consumer<? super T> action) {
    Objects.requireNonNull(action);
    for (T t : this) {
        action.accept(t);
    }
}

So when an iterable, e.g. a collection, doesn’t provide a forEach implementation, this inherited implementation does the same as the external iteration.

There are different reasons to override this default implementation. For example, the wrappers returned by one of the Collections.synchronized…(…) methods have an implementation that looks like this:

@Override
public void forEach(Consumer<? super E> consumer) {
    synchronized (mutex) {c.forEach(consumer);}
}

It delegates to the original collection’s forEach implementation, to do whatever it does, but performs the entire operation while holding the mutex. In other words, unlike external iteration, where the code has to care about locking itself, this implementation provides the synchronization for free.

Another interesting example is in the Collections.unmodifiable…(…) wrappers.

@Override
public void forEach(Consumer<? super E> action) {
    c.forEach(action);
}

The advantage is not obvious at the first glance. But, as you’ve shown in the question, an external iteration is done using an Iterator, even when using the for-each syntax. An Iterator has a remove method, so each time you perform an external iteration over an unmodifiable wrapper, the original collection’s iterator must be wrapped in another iterator preventing callers from using remove().

In contrast, the internal iteration is not modifying the collection by contract. So the wrapper can delegate to the original collection’s forEach method which will do the right thing. No wrapper around the iterator is required, if the target method uses an iterator at all.

E.g. this is ArrayList’s forEach method:

@Override
public void forEach(Consumer<? super E> action) {
    Objects.requireNonNull(action);
    final int expectedModCount = modCount;
    @SuppressWarnings("unchecked")
    final E[] elementData = (E[]) this.elementData;
    final int size = this.size;
    for (int i=0; modCount == expectedModCount && i < size; i++) {
        action.accept(elementData[i]);
    }
    if (modCount != expectedModCount) {
        throw new ConcurrentModificationException();
    }
}

So instead of an iterator based loop, it iterates over the indices and accesses its array directly.

Note that calling forEach on an iterable like a Collection is different to calling forEach on a Stream. A stream allows to chain various intermediate operations affecting what will happen when the terminal operation is commenced. When you don’t chain intermediate operations but just call forEach(…), the call may perform the equivalent to collection.spliterator() .forEachRemaining(…) at the stream implementation’s discretion.

This ends up at a different code path than calling forEach on the collection, but for reasonable collection implementations, these code paths will do basically the same. This answer compares the different variants for ArrayList and while there are minor differences, the fundamental way of iterating (using an index into the array) is the same.

How Internal iteration looks like for java 8 streams

1 Answers1