12

I am doing multiclass prediction with a random forest in Spark ML.

With MulticlassClassificationEvaluator() in Spark ML, is it possible to get precision/recall for each class label?

Currently, I only see precision/recall combined across all classes.

user4157124
Sam

2 Answers

1

Use org.apache.spark.mllib.evaluation.MulticlassMetrics directly and then read the per-class metrics it exposes:

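// adapted from the Spark source (MulticlassClassificationEvaluator)
import org.apache.spark.mllib.evaluation.MulticlassMetrics
import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.DoubleType

// `dataset` is the DataFrame of model predictions; "prediction" and "label"
// are the default column names, so adjust them to match your pipeline
val predictionAndLabels =
  dataset.select(col("prediction"), col("label").cast(DoubleType)).rdd.map {
    case Row(prediction: Double, label: Double) => (prediction, label)
  }
val metrics = new MulticlassMetrics(predictionAndLabels)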
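
Once a MulticlassMetrics instance exists, the per-class values can be read by iterating over its labels. The following is only a minimal sketch: the toy (prediction, label) pairs, the local SparkSession, and the `PerClassMetrics` / `perClass` names are assumptions made for illustration and stand in for real model output.

```scala
import org.apache.spark.mllib.evaluation.MulticlassMetrics
import org.apache.spark.sql.SparkSession

object PerClassMetrics {
  // Hypothetical helper: maps each label to its (precision, recall) pair
  def perClass(metrics: MulticlassMetrics): Map[Double, (Double, Double)] =
    metrics.labels.map(l => l -> (metrics.precision(l), metrics.recall(l))).toMap

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("per-class-metrics")
      .getOrCreate()

    // Illustrative (prediction, label) pairs, not real model output
    val predictionAndLabels = spark.sparkContext.parallelize(Seq(
      (0.0, 0.0), (0.0, 0.0), (1.0, 0.0),
      (1.0, 1.0), (1.0, 1.0), (2.0, 1.0),
      (2.0, 2.0), (0.0, 2.0)
    ))
    val metrics = new MulticlassMetrics(predictionAndLabels)

    // One line per class label
    perClass(metrics).toSeq.sortBy(_._1).foreach { case (label, (p, r)) =>
      println(f"label $label%.1f precision=$p%.3f recall=$r%.3f f1=${metrics.fMeasure(label)}%.3f")
    }

    // Rows are actual labels, columns are predicted labels
    println(metrics.confusionMatrix)

    spark.stop()
  }
}
```

The confusion matrix is often worth printing alongside the per-class numbers, since it shows which classes are being mistaken for each other.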
Som
0

Looking at the class documentation, this doesn't seem to be possible using the built-in methods.

Although not exactly what you are looking for, you could pass weightedPrecision and weightedRecall as the metricName parameter (via setMetricName). These at least account for class imbalance.
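
As a rough sketch of that suggestion, assuming a predictions DataFrame with the default "prediction" and "label" columns (the toy rows and the `WeightedMetrics` / `weighted` names below are made up for illustration):

```scala
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.sql.{DataFrame, SparkSession}

object WeightedMetrics {
  // Hypothetical helper: evaluates one named metric on a predictions DataFrame
  def weighted(predictions: DataFrame, metric: String): Double =
    new MulticlassClassificationEvaluator()
      .setLabelCol("label")
      .setPredictionCol("prediction")
      .setMetricName(metric)
      .evaluate(predictions)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("weighted-metrics")
      .getOrCreate()
    import spark.implicits._

    // Illustrative rows standing in for the output of model.transform(...)
    val predictions = Seq(
      (0.0, 0.0), (0.0, 0.0), (1.0, 0.0),
      (1.0, 1.0), (1.0, 1.0), (2.0, 1.0),
      (2.0, 2.0), (0.0, 2.0)
    ).toDF("prediction", "label")

    // Each metric averages the per-class values, weighted by class frequency
    println(s"weightedPrecision = ${weighted(predictions, "weightedPrecision")}")
    println(s"weightedRecall    = ${weighted(predictions, "weightedRecall")}")

    spark.stop()
  }
}
```

If you are on Spark 3.0 or later, it may also be worth checking the evaluator's documentation for the precisionByLabel and recallByLabel metric names, used together with setMetricLabel; as far as I know, these are not available in earlier releases.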

KT12
Owlright