
Problem:

I am attempting to train a PredictionIO project using Spark 1.6.1 and PredictionIO 0.9.5, but the job fails immediately after the executors begin to work. This happens on both a standalone Spark cluster and a Mesos cluster. In both cases I am deploying to the cluster from a remote client, i.e. I am running pio train -- --master [master on some other server].

Symptoms:

  • In the driver logs, shortly after the first [Stage 0:> (0 + 0) / 2] message, the executors die due to java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.protobuf.ProtobufUtil

Investigation:

  • Found the class-in-question within the pio-assembly jar:

    jar -tf pio-assembly-0.9.5.jar | grep ProtobufUtil
    org/apache/hadoop/hbase/protobuf/ProtobufUtil$1.class
    org/apache/hadoop/hbase/protobuf/ProtobufUtil.class
    
  • When submitting, this jar is deployed with the project and can be found within the executors
  • Adding --jars pio-assembly-0.9.5.jar to pio train does not fix the problem
  • Creating an uber jar with pio build --clean --uber-jar does not fix the problem
  • Setting SPARK_CLASSPATH on the slaves to a local copy of pio-assembly-0.9.5.jar does solve the problem

As far as I am aware, SPARK_CLASSPATH is deprecated and should be replaced with --jars when submitting. I'd rather not be dependent on a deprecated feature. Is there something I am missing when calling pio train, or in my infrastructure? Is there a defect (e.g. a race condition) in how the executors fetch the dependencies from the driver?

Jake Greene

1 Answer


The problem is that java.lang.NoClassDefFoundError: Could not initialize class doesn't actually mean that the dependency is missing. It is a poorly named error: the class was found, but the JVM failed to initialize it. The actual cause is reported separately as a java.lang.ExceptionInInitializerError, which is most likely thrown from a static initializer block the first time the class is used. It is easy to confuse java.lang.NoClassDefFoundError with java.lang.ClassNotFoundException, but the latter is the one that actually means the dependency is missing (this question and others provide more details).
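To illustrate the difference, here is a minimal sketch that reproduces the same pair of errors outside of Spark. The names InitFailureDemo and Fragile are hypothetical, and Fragile is only a stand-in for a class such as ProtobufUtil whose static initialization fails; it does not reproduce whatever actually goes wrong inside HBase.

```java
class InitFailureDemo {

    // Hypothetical stand-in for a class whose static initialization fails.
    static class Fragile {
        // Not a compile-time constant, so it is evaluated during class initialization.
        static final int VALUE = compute();

        static int compute() {
            throw new IllegalStateException("simulated failure in static initializer");
        }

        static void touch() {
            // Calling any static method forces the class to be initialized first.
        }
    }

    // Uses Fragile once and returns whatever was thrown, or null if nothing was.
    static Throwable access() {
        try {
            Fragile.touch();
            return null;
        } catch (Throwable t) {
            return t;
        }
    }

    public static void main(String[] args) {
        // First use: initialization runs and fails, so the JVM reports
        // ExceptionInInitializerError with the real cause attached.
        Throwable first = access();
        System.out.println(first.getClass().getName() + " caused by " + first.getCause());

        // Later uses: the class is permanently marked as failed, so the JVM
        // reports NoClassDefFoundError: Could not initialize class ...
        Throwable second = access();
        System.out.println(second.getClass().getName() + ": " + second.getMessage());
    }
}
```

The class is on the classpath the whole time, yet every use after the first fails with NoClassDefFoundError. In the executor logs, the useful entry is therefore the earlier ExceptionInInitializerError and its cause, not the later Could not initialize class messages.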

Community