
pio train (after a successful pio build) gives me an error like this:

[ERROR] [Executor] Exception in task 0.0 in stage 39.0 (TID 34)
[WARN] [TaskSetManager] Lost task 0.0 in stage 39.0 (TID 34, localhost): java.lang.StackOverflowError
at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2321)
at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2614)
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2624)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1321)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1909)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)

From here on, the ObjectInputStream frames repeat more or less until the stack is full.

Does anyone have a hint as to what this might be or how to debug it?

NB: I'm running PredictionIO inside a Docker container, which might (?) be causing the problem, but again: I don't really know how to proceed from there.

Any help is truly appreciated.

PS: I increased the stack size using SPARK_DAEMON_JAVA_OPTS="-Xss=9m" with no effect, but I guess an infinite recursion is the culprit anyway.
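
If the stack size actually has to reach the Spark driver and executor JVMs (rather than only the Spark daemons, which is all SPARK_DAEMON_JAVA_OPTS covers as far as I can tell), I assume it would have to be passed through to spark-submit instead, using the plain -Xss9m syntax without the equals sign, something like:

pio train -- --driver-java-options -Xss9m --conf spark.executor.extraJavaOptions=-Xss9m

I haven't verified whether that helps, though.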

Tommy
  • which template are you using? – alex9311 Jun 30 '16 at 13:16
  • The [similarproduct](http://templates.prediction.io/PredictionIO/template-scala-parallel-similarproduct) template. What I didn't mention is that a couple of weeks ago (when I last tried it) it worked fine. So _something_ changed. – Tommy Jun 30 '16 at 13:22

1 Answer


A similar error showed up in my case (also using a Docker container). I found two ways to resolve the problem.

Option 1: Increase PredictionIO's driver memory

Use the --driver-memory flag:

pio train -- --driver-memory 2g

From the Tapster example:

[Use] the --driver-memory option to limit the memory used by Apache PredictionIO (incubating). Without this Apache PredictionIO (incubating) can consume too much memory leading to a crash.
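
If it is the executors rather than the driver that run out of memory, the same pass-through should accept the other standard spark-submit options as well, for example (the 4g value is just an illustration, not taken from the PredictionIO docs):

pio train -- --driver-memory 2g --executor-memory 4g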

Option 2: Increase JVM memory

This can be accomplished by running export JAVA_OPTS=-Xmx2g before pio train.
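
A minimal sketch of the sequence (assuming a Bash-like shell, and assuming pio picks up JAVA_OPTS as described above):

export JAVA_OPTS=-Xmx2g
pio train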

See What are the Xms and Xmx parameters when starting JVMs? for more detail on the JVM's memory options.

Jonny5
  • I don't have the code at hand right now. But I find it interesting that, according to the doc you quoted, `--driver-memory` _limits_ the memory usage. So (if the doc is right) the memory given to pio is reduced this way. – Tommy Jan 24 '17 at 10:16
  • It's not entirely clear to me. Is it possible that by using this flag, the algorithm limits its memory usage instead of increasing the reserved memory? But if I recall correctly, I reset the JVM memory to a value far less than 2GB, hence it seems that the allocated memory is also influenced. When I have some more time I'll try to get a bit deeper on this subject. – Jonny5 Jan 25 '17 at 15:08
  • I tried `pio train -- --driver-memory 4g --conf spark.executor.cores=1`, and it worked. – Cyanny Aug 21 '18 at 04:36