
Running the following code sample ends with:
"Exception in thread "main" java.lang.StackOverflowError"

import java.util.stream.IntStream;
import java.util.stream.Stream;

public class TestStream {

    public static void main(String[] args) {
        Stream<String> reducedStream = IntStream.range(0, 15000)
            .mapToObj(Abc::new)
            .reduce(
                Stream.of("Test")
                , (str , abc) -> abc.process(str)
                , (a , b) -> {throw new IllegalStateException();}
        );
        System.out.println(reducedStream.findFirst().get());
    }

    private static class Abc { 
        public Abc(int id) {
        }

        public Stream<String> process(Stream<String> batch) {
            return batch.map(this::doNothing);
        }

        private String doNothing(String test) {
            return test;
        }
    }
}

What exactly is causing that issue? Which part of this code is recursive and why?

slowikps
  • Why are you reducing a Stream into another Stream? – Tunaki Apr 15 '16 at 20:13
  • there is no recursive call and it works fine on my machine – Andrew Tobilko Apr 15 '16 at 20:15
  • The problem comes from the 15000 calls to `map`. Internally, it must chain them somehow. You can reproduce it without `Abc`: `IntStream.range(0, 15000).boxed().reduce(Stream.of("Test"), (str, abc) -> str.map(s -> s), (a, b) -> {throw new IllegalStateException();})` – Tunaki Apr 15 '16 at 20:17
  • The stack trace from the `StackOverflowError` should reveal the cycle of methods calling each other. Please include the relevant part of it in your post. – rgettman Apr 15 '16 at 20:19
  • Even shorter: `Stream<String> st = Stream.of("Test"); for (int i = 0; i < 15000; i++) st = st.map(s -> s); st.findFirst();` I don't know if we can qualify this as a bug though. Each map operation creates a temporary object wrapping the current Stream, and when traversal begins it must go through all those indirections. And it appears to be too much. – Tunaki Apr 15 '16 at 20:31
  • Note: this error is also mentioned in this ticket: https://bugs.openjdk.java.net/browse/JDK-8025523, by Paul Sandoz: *Note that this will not stop all stack overflows, and it is possible to create stack overflows in other ways, for example creating a very long pipeline, say Stream.of(1) with many map(Function.identity()) operations with a terminal operation such as toArray(). In that respect it may be appropriate to sprinkle some implementation notes in the documentation.* – Tunaki Apr 15 '16 at 21:23
  • This is a simplified example. The main idea was that every component can produce a set of results based on the previous component's results. The final answer is the first element in the stream after all components have processed it - that extraction was omitted as it was irrelevant to the question. Thanks @Tunaki for pointing me to that ticket. – slowikps Apr 16 '16 at 14:36
  • I tried a non-Stream alternative that called Function.andThen() 15000 times instead of 15000 .map() operations in a Stream and still got the stack overflow – Hank D Apr 16 '16 at 17:39
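As the last comment notes, the same overflow can be reproduced without streams at all, because `Function.andThen` also builds a nested chain of delegating objects that is only walked when the composed function is applied. A minimal sketch (class name and the depth of 100000 are my own; the depth is chosen to exceed typical default thread stack sizes):

```java
import java.util.function.Function;

public class AndThenDemo {
    public static void main(String[] args) {
        Function<String, String> f = Function.identity();
        for (int i = 0; i < 100_000; i++) {
            f = f.andThen(s -> s); // each call wraps f in a new composed function
        }
        try {
            // applying the outermost function recurses through every wrapper
            f.apply("Test");
        } catch (StackOverflowError e) {
            System.out.println("StackOverflowError while applying the composed function");
        }
    }
}
```

Building the chain is cheap; the overflow only occurs when `apply` traverses the nested compositions.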

1 Answer


Your code isn't recursively looping. You can verify this with smaller values for the IntStream range (e.g. 1 or 100). In your case it's the actual stack size limit that causes the problem. As pointed out in some of the comments, it's the way the streams are processed.

Each invocation on a stream creates a new wrapper stream around the original one. The 'findFirst()' method asks the previous stream for elements, which in turn asks its previous stream for elements, and so on, because streams are not real containers but merely pointers to the elements of the result.

The wrapper explosion happens in the reduce method's accumulator '(str, abc) -> abc.process(str)'. The implementation of 'process' creates a new stream wrapper around the result (str) of the previous operation, which feeds into the next iteration, creating a new wrapper around result(result(str)), and so on. So the accumulation mechanism is one of wrapping (recursion) rather than appending (iteration). Creating a new stream from the actual (flattened) result, rather than from a reference to the potential result, stops the explosion, i.e.
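The same wrapper chain can be built without reduce at all; a minimal sketch (class and method names are my own, and the depth of 100000 is chosen to exceed typical default stack sizes):

```java
import java.util.stream.Stream;

public class WrapperChainDemo {
    // Builds a pipeline of `depth` map(identity) stages over a one-element stream.
    static Stream<String> chain(int depth) {
        Stream<String> s = Stream.of("Test");
        for (int i = 0; i < depth; i++) {
            s = s.map(x -> x); // each call wraps the previous stream in a new stage
        }
        return s;
    }

    public static void main(String[] args) {
        // A shallow chain traverses fine:
        System.out.println(chain(100).findFirst().get()); // prints "Test"

        // Building a deep chain is cheap; only traversal walks the wrappers:
        Stream<String> deep = chain(100_000);
        try {
            deep.findFirst();
        } catch (StackOverflowError e) {
            System.out.println("StackOverflowError during traversal");
        }
    }
}
```

Note that the overflow happens in the terminal operation, not while the pipeline is being assembled.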

public Stream<String> process(Stream<String> batch) {
    // requires import java.util.stream.Collectors
    return Stream.of(batch.map(this::doNothing).collect(Collectors.joining()));
}

This method is just an example; your original code doesn't make any sense because it does nothing, and neither does this one - it's just an illustration. It flattens the elements of the stream returned by the map method into a single string and creates a new stream from that concrete string rather than from another stream; that's the difference from your original code.
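For reference, a complete runnable version of the original program with the flattened accumulator applied; the full 15000-step reduction now completes:

```java
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class TestStreamFixed {

    public static void main(String[] args) {
        Stream<String> reducedStream = IntStream.range(0, 15000)
            .mapToObj(Abc::new)
            .reduce(
                Stream.of("Test"),
                (str, abc) -> abc.process(str),
                (a, b) -> { throw new IllegalStateException(); });
        System.out.println(reducedStream.findFirst().get()); // prints "Test"
    }

    private static class Abc {
        public Abc(int id) {
        }

        // Eagerly drains the incoming stream, so each step returns a fresh,
        // shallow stream instead of another wrapper around the previous one.
        public Stream<String> process(Stream<String> batch) {
            return Stream.of(batch.map(this::doNothing).collect(Collectors.joining()));
        }

        private String doNothing(String test) {
            return test;
        }
    }
}
```

Because each `process` call terminates the incoming pipeline with `collect`, the stream handed to the next iteration is always one stage deep.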

You could tune the stack size using the '-Xss' parameter, which defines the stack size per thread, e.g. 'java -Xss16m TestStream'. The default value depends on the platform; see also the question 'What is the maximum depth of the java call stack?' But take care when increasing it: the setting applies to all threads.

Gerald Mücke
  • Thanks. I need to spend a bit more time debugging Java streams. This works: `return Stream.of(batch.map(this::doNothing).collect(Collectors.joining()));` but it is a very strange line of code :) – slowikps Apr 20 '16 at 19:57