
Here is the error:

Exception in thread "main" java.lang.NoSuchMethodError: breeze.linalg.Vector$.scalarOf()Lbreeze/linalg/support/ScalarOf;
at org.apache.spark.ml.knn.Leaf$$anonfun$4.apply(MetricTree.scala:95)
at org.apache.spark.ml.knn.Leaf$$anonfun$4.apply(MetricTree.scala:93)
at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:57)
at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:66)
at scala.collection.mutable.ArrayBuffer.foldLeft(ArrayBuffer.scala:48)
at org.apache.spark.ml.knn.Leaf$.apply(MetricTree.scala:93)
at org.apache.spark.ml.knn.MetricTree$.build(MetricTree.scala:169)
at org.apache.spark.ml.knn.KNN.fit(KNN.scala:388)
at org.apache.spark.ml.classification.KNNClassifier.train(KNNClassifier.scala:109)
at org.apache.spark.ml.classification.KNNClassifier.fit(KNNClassifier.scala:117)
at SparkKNN$.main(SparkKNN.scala:23)
at SparkKNN.main(SparkKNN.scala)

Here is the program that is triggering the error:

import org.apache.spark.sql.SparkSession
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.classification.KNNClassifier

object SparkKNN {
    def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().master("local").config("spark.sql.warehouse.dir", "file:///c:/tmp/spark-warehouse").getOrCreate()
        val sc = spark.sparkContext
        import spark.implicits._
        //read in raw label and features
        val training = spark.read.format("com.databricks.spark.csv").option("header", true).load("E:/Machine Learning/knn_input.csv")
        var df = training.selectExpr("cast(label as double) label", "cast(feature1 as int) feature1","cast(feature2 as int) feature2","cast(feature3 as int) feature3")
        val assembler = new VectorAssembler().setInputCols(Array("feature1","feature2","feature3")).setOutputCol("features")
        df = assembler.transform(df)
        //MLUtils.loadLibSVMFile(sc, "C:/Program Files (x86)/spark-2.0.0-bin-hadoop2.7/data/mllib/sample_libsvm_data.txt").toDF()

        val knn = new KNNClassifier()
            .setTopTreeSize(df.count().toInt / 2)
            .setK(10)
        val splits = df.randomSplit(Array(0.7, 0.3))
        val (trainingData, testData) = (splits(0), splits(1))
        val knnModel = knn.fit(trainingData)

        val predicted = knnModel.transform(testData)
        predicted.show()
    }
}

I am using Apache Spark 2.0 with Scala 2.11.8. It looks like a version mismatch issue. Any ideas?

Tom

1 Answer


Spark MLlib 2.0 brings in this version of Breeze:

"org.scalanlp" % "breeze_2.11" % "0.11.2"

You must have another library on your classpath that depends on a different version of Breeze, and that version is the one actually being loaded. As a result, MLlib is running at runtime against a different Breeze than the one it was compiled against, which is exactly the kind of binary incompatibility a NoSuchMethodError indicates.
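To confirm which Breeze artifact is actually being picked up, one quick diagnostic (a minimal sketch, not specific to spark-knn) is to ask the JVM where the offending class was loaded from:

// Minimal diagnostic sketch: print which jar the Breeze Vector companion object
// comes from. Run it on the driver (or paste the body into spark-shell); the
// printed path reveals the Breeze artifact that is actually on the classpath.
object BreezeClasspathCheck {
    def main(args: Array[String]): Unit = {
        val clazz = Class.forName("breeze.linalg.Vector$")
        val source = clazz.getProtectionDomain.getCodeSource
        println(s"breeze.linalg.Vector$$ loaded from: ${source.getLocation}")
    }
}

If the printed jar is not breeze_2.11-0.11.2, you have found the conflicting artifact.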

You have several options, as sketched below. You can find the undesirable transitive dependency on Breeze and exclude it. You can upgrade the other library to a version that has the same Breeze dependency MLlib does. Or you can add a direct dependency on the Breeze version MLlib needs.
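For example, in sbt (a minimal sketch; the "saurfang" % "spark-knn_2.11" % "0.1.0" coordinates are placeholders, substitute whichever library is pulling in the conflicting Breeze):

// build.sbt sketch, assuming Spark 2.0.0 on Scala 2.11 and a placeholder
// spark-knn artifact as the source of the conflicting Breeze dependency.
libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-mllib" % "2.0.0" % "provided",
    // Option 1: exclude the transitive Breeze brought in by the other library...
    ("saurfang" % "spark-knn_2.11" % "0.1.0").exclude("org.scalanlp", "breeze_2.11"),
    // Option 3: ...and/or declare the Breeze version MLlib needs as a direct dependency.
    "org.scalanlp" % "breeze_2.11" % "0.11.2"
)

Option 2 (upgrading the other library to a release built against the same Breeze as MLlib) needs no exclusion at all, just a version bump of that dependency.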

Vidya