
I need to create a service that parses information from paged websites and returns an iterator of parsed info.

To do so, I often chain the parsing steps together with streams. However, I've noticed that if you call iterator() on a Java stream that includes a flatMap() call, each flat-mapped stream is read fully before its first element is returned. If one of those streams takes a long time to complete, or is infinite, the resulting iterator will never return an element.

Is this by design? Should I be doing something differently? Have a look at the sample code below. Note how the output changes when using forEach() vs. iterator().

package temp;

import java.util.Arrays;
import java.util.Iterator;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

import com.google.common.collect.AbstractIterator;

public class StreamTest {

    public static void main(String[] args) {
        // set first iterable max index randomly
        int MAX_INDEX = ThreadLocalRandom.current().nextInt(10, 20 + 1);
        System.out.println("max index: " + MAX_INDEX);
        // create slow iterable
        Iterable<String> iterable1 = () -> new AbstractIterator<String>() {

            private int index = -1;

            @Override
            protected String computeNext() {
                index++;
                if (index >= MAX_INDEX) {
                    return this.endOfData();
                }
                System.out.println("dummy computing index: " + index);
                try {
                    Thread.sleep(500);
                } catch (InterruptedException e) {
                    // restore the interrupt flag before propagating
                    Thread.currentThread().interrupt();
                    throw new RuntimeException(e);
                }
                return "iterable " + index;
            }
        };
        // create list
        Iterable<String> iterable2 = Arrays.asList("list index 1", "list index 2", "list index 3");
        // create a stream supplier
        Supplier<Stream<String>> streamSupplier = () -> Arrays.asList(iterable1, iterable2).stream()
                .flatMap(i -> StreamSupport.stream(i.spliterator(), false));
        // print using for each
        System.out.println("\n***testing for each***");
        streamSupplier.get().forEach(str -> {
            System.out.println("for each - " + str);
        });
        System.out.println("\n***testing iterator***");
        Iterator<String> iter = streamSupplier.get().iterator();
        while (iter.hasNext()) {
            System.out.println("iterator - " + iter.next());
        }
    }

}

Here is the output from the above:

max index: 12

***testing for each***
dummy computing index: 0
for each - iterable 0
dummy computing index: 1
for each - iterable 1
dummy computing index: 2
for each - iterable 2
dummy computing index: 3
for each - iterable 3
dummy computing index: 4
for each - iterable 4
dummy computing index: 5
for each - iterable 5
dummy computing index: 6
for each - iterable 6
dummy computing index: 7
for each - iterable 7
dummy computing index: 8
for each - iterable 8
dummy computing index: 9
for each - iterable 9
dummy computing index: 10
for each - iterable 10
dummy computing index: 11
for each - iterable 11
for each - list index 1
for each - list index 2
for each - list index 3

***testing iterator***
dummy computing index: 0
dummy computing index: 1
dummy computing index: 2
dummy computing index: 3
dummy computing index: 4
dummy computing index: 5
dummy computing index: 6
dummy computing index: 7
dummy computing index: 8
dummy computing index: 9
dummy computing index: 10
dummy computing index: 11
iterator - iterable 0
iterator - iterable 1
iterator - iterable 2
iterator - iterable 3
iterator - iterable 4
iterator - iterable 5
iterator - iterable 6
iterator - iterable 7
iterator - iterable 8
iterator - iterable 9
iterator - iterable 10
iterator - iterable 11
iterator - list index 1
iterator - list index 2
iterator - list index 3

Shouldn't iterator() and forEach() produce the same output?
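For comparison, here is a minimal sketch of the element-by-element behaviour I expected, written without flatMap(). It assumes the sources are plain Iterables and that only sequential iteration is needed. LazyFlatten and flatten are hypothetical names made up for this illustration, not part of the JDK or Guava.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;

public class LazyFlatten {

    /**
     * Hypothetical helper: walks each source in order and pulls one element
     * at a time, so a slow or endless source is never read ahead.
     */
    public static <T> Iterator<T> flatten(Iterable<? extends Iterable<? extends T>> sources) {
        Iterator<? extends Iterable<? extends T>> outer = sources.iterator();
        return new Iterator<T>() {
            private Iterator<? extends T> inner = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // move on to the next source only once the current one is exhausted
                while (!inner.hasNext()) {
                    if (!outer.hasNext()) {
                        return false;
                    }
                    inner = outer.next().iterator();
                }
                return true;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return inner.next();
            }
        };
    }

    public static void main(String[] args) {
        // an endless source: reading it fully before returning would never finish
        Iterable<String> endless = () -> new Iterator<String>() {
            private int i = 0;

            @Override
            public boolean hasNext() {
                return true;
            }

            @Override
            public String next() {
                return "endless " + i++;
            }
        };
        Iterable<String> list = Arrays.asList("list index 1", "list index 2");

        Iterator<String> it = flatten(Arrays.asList(endless, list));
        // only three elements are ever requested from the endless source
        for (int n = 0; n < 3 && it.hasNext(); n++) {
            System.out.println("iterator - " + it.next());
        }
    }
}
```

With this, the first element comes back as soon as the first source produces it, which is the behaviour I assumed flatMap(...).iterator() would have. The trade-off is that the rest of the stream pipeline (map, filter and so on) has to be applied to the resulting iterator separately.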

Stefan Zobel
regbo
    Welcome to StackOverflow. This is a well-written first question, so +1. But it's also a dup of https://stackoverflow.com/a/46291806/18157, which took me a few minutes to find. – Jim Garrison Feb 12 '18 at 18:42
