
I haven't used R in a while, so maybe I'm just not used to it yet, but I have a table in R with two columns: the first has predicted values (each value is either 0 or 1), the second has the actual values (also 0 or 1). I need to find recall, precision and F-measure, but I cannot find a good function for it in R. (I also read about ROCR, but all I could do was create some plots, and I really don't need plots, I need the numbers.)

Are there any good functions for finding precision, recall and F-measure in R? Are there other ways to do it?

user2314737
Fanny

4 Answers


First I create a data set as

> predict <- sample(c(0, 1), 20, replace=T)
> true <- sample(c(0, 1), 20, replace=T)

I suppose the 1's in the predicted values mark the retrieved instances. The total number of retrieved instances is

> retrieved <- sum(predict)

Precision, the fraction of retrieved instances that are relevant, is

> precision <- sum(predict & true) / retrieved

Recall, the fraction of relevant instances that are retrieved, is

> recall <- sum(predict & true) / sum(true)

The F-measure, defined as 2 * precision * recall / (precision + recall), is

> Fmeasure <- 2 * precision * recall / (precision + recall)
JACKY88
  • Thanks, it worked really well! (and was much simpler than I thought, I guess I was overthinking again) – Fanny Sep 25 '12 at 18:21
  • When predict is all 0 (the classifier predicted all samples as belonging to class 0), then retrieved = 0 and you divide by 0 – Omri374 Nov 28 '13 at 12:45
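A minimal sketch of a guard for that edge case, assuming you would rather get NA than an error or NaN when nothing is retrieved; safe_div is a made-up helper here, not from any package:

# Hypothetical helper: return NA instead of NaN or an error when the
# denominator is zero (or itself NA from an earlier step)
safe_div <- function(num, den) {
  if (is.na(den) || den == 0) NA else num / den
}

precision <- safe_div(sum(predict & true), sum(predict))
recall    <- safe_div(sum(predict & true), sum(true))
Fmeasure  <- safe_div(2 * precision * recall, precision + recall)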

Just packaging Patrick's great answer neatly into a function ...

measurePrecisionRecall <- function(predict, actual_labels){
  # Precision: fraction of predicted positives that are actually positive
  precision <- sum(predict & actual_labels) / sum(predict)
  # Recall: fraction of actual positives that were predicted positive
  recall <- sum(predict & actual_labels) / sum(actual_labels)
  # F-measure: harmonic mean of precision and recall
  fmeasure <- 2 * precision * recall / (precision + recall)

  cat('precision:  ', precision * 100, '%\n', sep='')
  cat('recall:     ', recall * 100, '%\n', sep='')
  cat('f-measure:  ', fmeasure * 100, '%\n', sep='')
}
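A quick usage sketch, reusing simulated 0/1 vectors like those in the first answer (the numbers printed will vary with the random draw):

> predict <- sample(c(0, 1), 20, replace=T)
> true <- sample(c(0, 1), 20, replace=T)
> measurePrecisionRecall(predict, true)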
BGA

You can get all these metrics with the function confusionMatrix() from the caret package.

# Load caret (install.packages("caret") first if it is missing)
library(caret)

# Create a sample
predicted <- as.factor(sample(c(0, 1), 100, replace=T))
realized  <- as.factor(sample(c(0, 1), 100, replace=T))

# Compute the confusion matrix and all the statistics
result <- confusionMatrix(predicted, realized, mode="prec_recall")

result
result$byClass["Precision"]
result$byClass["Recall"]
result$byClass["F1"]
adDS
  • Uhm.. These are the metrics I can obtain with those commands: Sensitivity, Specificity, Prevalence, PPV, NPV, Detection Rate, Detection Prevalence, Balanced Accuracy. I assume you are using a different version of the caret package. Maybe someone else has a different version too. – Ale Aug 20 '20 at 07:22
  • I checked and I had caret 2.1-2. I updated my old package to 6.0-86 and the F1 score was there. Thanks – Ale Aug 20 '20 at 07:35
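One caveat worth noting: by default confusionMatrix() takes the first factor level ("0" here) as the positive class, so Precision, Recall and F1 are reported for class 0. If class 1 is the class of interest, the positive argument selects it:

# Report precision/recall/F1 with "1" as the positive class
result <- confusionMatrix(predicted, realized, mode="prec_recall", positive="1")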
measurePrecisionRecall <- function(actual_labels, predict){
  # Force both levels so the table always has '0' and '1' rows and columns,
  # even when one class is absent from the data (a plain table() call would
  # otherwise fail on the '0' indexing below)
  conMatrix <- table(factor(actual_labels, levels=c(0, 1)),
                     factor(predict, levels=c(0, 1)))

  # These metrics treat class '0' as the positive class; the ifelse guards
  # substitute 1 for a zero denominator to avoid dividing by zero (the
  # numerator is also 0 in that case, so the result is 0)
  precision <- conMatrix['0','0'] / ifelse(sum(conMatrix[,'0']) == 0, 1, sum(conMatrix[,'0']))
  recall <- conMatrix['0','0'] / ifelse(sum(conMatrix['0',]) == 0, 1, sum(conMatrix['0',]))
  fmeasure <- 2 * precision * recall / ifelse(precision + recall == 0, 1, precision + recall)

  cat('precision:  ', precision * 100, '%\n', sep='')
  cat('recall:     ', recall * 100, '%\n', sep='')
  cat('f-measure:  ', fmeasure * 100, '%\n', sep='')
}
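A usage sketch; note that in this version the argument order is actual labels first, and the metrics are computed with class 0 as the positive class:

> actual <- sample(c(0, 1), 20, replace=T)
> predicted <- sample(c(0, 1), 20, replace=T)
> measurePrecisionRecall(actual, predicted)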
Billa