I would like to expand on the question "Find the index of the column in data frame that contains the string as value".
I have the following data:
data <- data.frame(expert  = c("class.1", "class.4", "class.2"),
                   choice1 = c("class.3", "class.8", "class.10"),
                   score1  = c(0.92, 0.91, 0.30),
                   choice2 = c("class.1", "class.7", "class.9"),
                   score2  = c(0.70, 0.78, 0.30),
                   choice3 = c("class.6", "class.1", "class.2"),
                   score3  = c(0.01, 0.58, 0.30),
                   stringsAsFactors = FALSE
)
I would like to get the score associated with the expert's choice. The goal is to find out whether choice1 is correct, but I also need to check whether the score of the class chosen by the expert is tied with the score of another choice.
So in the example data, using dplyr:
library(dplyr)
data %>% mutate(Right = expert == choice1)
gets part of the answer, but doesn't handle ties. The answer to "Find the index of the column in data frame that contains the string as value" uses grepl, which I don't think can handle a vector of regex patterns.
I've tried max, max.col, and which, alone and in combination with rowwise(), but I just can't seem to get the right answer. I've also made the data "tidy" using reshape (thanks UCLA IDRE http://stats.idre.ucla.edu/r/faq/how-can-i-reshape-my-data-in-r/), but I was unable to filter the data appropriately.
tidydata <- reshape(data,
                    varying = list(paste0("choice", 1:3),
                                   paste0("score", 1:3)),
                    direction = "long",
                    v.names = c("choice", "score")) %>%
  arrange(id) %>%
  filter(expert == choice)
This tells me which column holds the expert's choice, but I lose the connection to choice1.
The best solution would be a function that returns a factor (right, tie, wrong), where row 3 would return tie.
Edit: This data compares the results of a classifier to a human annotator. The classifier can sometimes yield tied results (the scores for 2 or more classes are the same). I am trying to identify three cases: the classifier is correct (choice1==expert) and not tied (I call this RIGHT); the class selected by the expert and the class selected by the classifier have the same score but are not the same code (I call this TIE); otherwise the classifier is WRONG. Thank you
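To make the three cases concrete, here is a minimal base R sketch of the rule as I read it, not a definitive solution. It assumes that the classifier's pick is whichever choice has the highest score (so choice1 in the data above), that a row where the expert's class holds the top score jointly with other classes counts as a tie, and that scores can be compared with == (true for the literal values here, but computed scores may need a tolerance). The names expert_col, expert_score, top_score and n_top are only illustrative.

```r
data <- data.frame(expert  = c("class.1", "class.4", "class.2"),
                   choice1 = c("class.3", "class.8", "class.10"),
                   score1  = c(0.92, 0.91, 0.30),
                   choice2 = c("class.1", "class.7", "class.9"),
                   score2  = c(0.70, 0.78, 0.30),
                   choice3 = c("class.6", "class.1", "class.2"),
                   score3  = c(0.01, 0.58, 0.30),
                   stringsAsFactors = FALSE)

choices <- as.matrix(data[paste0("choice", 1:3)])
scores  <- as.matrix(data[paste0("score", 1:3)])

# Position of the expert's class among the choices (NA if it is not there).
# The vector data$expert is recycled down the columns, so each row of
# `choices` is compared with that row's expert value.
hit <- choices == data$expert
expert_col <- apply(hit, 1, function(r) if (any(r)) which(r)[1] else NA_integer_)

# Score attached to the expert's class; an NA index row yields NA.
expert_score <- scores[cbind(seq_len(nrow(data)), expert_col)]

# Highest score in each row and how many choices share it.
top_score <- apply(scores, 1, max)
n_top     <- rowSums(scores == top_score)

# wrong: expert's class is absent or scores below the top score
# tie:   expert's class has the top score, but shares it with others
# right: expert's class alone has the top score
result <- ifelse(is.na(expert_score) | expert_score < top_score, "wrong",
                 ifelse(n_top > 1, "tie", "right"))

data$Right <- factor(result, levels = c("right", "tie", "wrong"))
```

With the example data, rows 1 and 2 come out as wrong and row 3 as tie, which matches what I expect for row 3.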