
I wrote this code and I always get the following error on this line:

val randomForestModel = randomForestClassifier.fit(trainingData)

the code:

import org.apache.spark.ml.classification.RandomForestClassifier

val seed = 5043
val Array(trainingData, testData) = labelDf.randomSplit(Array(0.7, 0.3), seed)
trainingData.cache()
testData.cache()

// train a Random Forest model on the training set
val randomForestClassifier = new RandomForestClassifier()
  .setImpurity("gini")
  .setMaxDepth(3)
  .setNumTrees(20)
  .setFeatureSubsetStrategy("auto")
  .setSeed(seed)

val randomForestModel = randomForestClassifier.fit(trainingData)

println(randomForestModel.toDebugString)

The error:

ERROR Instrumentation: org.apache.spark.SparkException: Task not serializable
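For context on what this error means: `Task not serializable` is thrown when a closure shipped to executors captures a non-serializable object. A common trigger is that the snippet lives inside a non-serializable enclosing class (a notebook cell wrapper or an outer class), so referencing a field such as `seed` inside a closure drags the whole outer instance along. A minimal sketch of the mechanism using plain Java serialization, no Spark required; `Outer` and `isSerializable` are hypothetical names for illustration:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical enclosing class, deliberately NOT Serializable.
class Outer {
  val seed = 5043

  // `seed` is really `this.seed`, so this lambda captures the whole
  // (non-serializable) Outer instance.
  def badClosure: Int => Int = x => x + seed

  // Copying the field to a local val first means the lambda only
  // captures an Int, not `this`.
  def goodClosure: Int => Int = {
    val localSeed = seed
    x => x + localSeed
  }
}

// Attempt Java serialization, which is how Spark ships task closures.
def isSerializable(obj: AnyRef): Boolean =
  try {
    new ObjectOutputStream(new ByteArrayOutputStream).writeObject(obj)
    true
  } catch {
    case _: NotSerializableException => false
  }
```

Spark reports the `badClosure` pattern as `Task not serializable`; copying fields to local vals before the closure is the usual workaround when the enclosing class cannot be made `Serializable`.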
  • maybe problem in scala version, did you use scala 2.12.8? https://stackoverflow.com/questions/60531764/spark-dataframe-stat-throwing-task-not-serializable#comment107090455_60531764 – Boris Azanov Mar 08 '20 at 09:37
  • yes I use scala 2.12 –  Mar 09 '20 at 09:03
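Following up on the comment thread: Spark artifacts are built against a specific Scala binary version (the `_2.11` / `_2.12` suffix in jar names such as `spark-mllib_2.12`), and mixing Spark jars built for one Scala version with an application compiled for another can surface as serialization errors like this one. A small check of what the driver is actually running, assuming you can add a `println` before the `fit` call:

```scala
// Print the Scala version the application runs on; it should match the
// _2.xx suffix of the Spark jars on the classpath (e.g. spark-mllib_2.12).
println(s"Scala version: ${scala.util.Properties.versionNumberString}")
```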
