Problem:
I am attempting to train a PredictionIO project using Spark 1.6.1 and PredictionIO 0.9.5, but the job fails immediately after the executors begin to work. This happens both on a standalone Spark cluster and on a Mesos cluster. In both cases I am deploying to the cluster from a remote client, i.e. I am running pio train -- --master [master on some other server].
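For concreteness, the invocations look roughly like this (the master URLs are placeholders for my actual hosts):

    # standalone cluster
    pio train -- --master spark://spark-master.example.com:7077
    # Mesos cluster
    pio train -- --master mesos://mesos-master.example.com:5050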
Symptoms:
- In the driver logs, shortly after the first [Stage 0:> (0 + 0) / 2] message, the executors die due to java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.protobuf.ProtobufUtil
Investigation:
- Found the class in question within the pio-assembly jar:

    jar -tf pio-assembly-0.9.5.jar | grep ProtobufUtil
    org/apache/hadoop/hbase/protobuf/ProtobufUtil$1.class
    org/apache/hadoop/hbase/protobuf/ProtobufUtil.class
- When submitting, this jar is deployed with the project and can be found on the executors
- Adding --jars pio-assembly-0.9.5.jar to pio train does not fix the problem
- Creating an uber jar with pio build --clean --uber-jar does not fix the problem
- Setting SPARK_CLASSPATH on the slaves to a local copy of pio-assembly-0.9.5.jar does solve the problem (the commands for all three attempts are sketched below)
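For reference, the commands behind those three bullets look roughly like this (the master URL and jar path are placeholders for my environment):

    # attempt 1: ship the assembly explicitly with --jars (does not fix the problem)
    pio train -- --master spark://spark-master.example.com:7077 --jars pio-assembly-0.9.5.jar

    # attempt 2: build an uber jar before training (does not fix the problem)
    pio build --clean --uber-jar
    pio train -- --master spark://spark-master.example.com:7077

    # workaround: on every slave, point SPARK_CLASSPATH at a local copy (does solve the problem)
    export SPARK_CLASSPATH=/opt/pio/pio-assembly-0.9.5.jar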
As far as I am aware, SPARK_CLASSPATH is deprecated and should be replaced with --jars when submitting, so I'd rather not be dependent on a deprecated feature. Is there something I am missing when calling pio train, or with my infrastructure? Is there a defect (e.g. a race condition) in the way the executors fetch the dependencies from the driver?
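For completeness, my understanding is that the non-deprecated equivalent of SPARK_CLASSPATH would be the extraClassPath settings, e.g. something like the following in spark-defaults.conf (the jar path is just where I would place a copy on each node); still, I would prefer a setup where --jars works as intended:

    # spark-defaults.conf; the jar must exist at this path on the driver and on every worker
    spark.driver.extraClassPath    /opt/pio/pio-assembly-0.9.5.jar
    spark.executor.extraClassPath  /opt/pio/pio-assembly-0.9.5.jar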