Let's say I'm tracking the answers to 100 problems on a quiz taken by 1000 students. Many students give one of four or five answers, but a large number also give wildly incorrect answers that very few other students give. How can I recode all of those wildly wrong answers to a single new value, "wildly wrong", while keeping the common ones unchanged? For the sake of example, let's say that for one question 200 students give one answer, 150 give a second, 160 give a third, and 490 each give an answer that no one else gave. For another, 80 students give one answer, 50 a second, and 30 a third, but 840 each give an answer that no one else gave. I want to turn the 490 for the first question and the 840 for the second all into "wildly wrong".
I looked at purrr, but I think I'm missing something that could automate this across all the questions, given that I want, say, the top 3 answers for each question to remain unchanged and the rest recoded.
Shortening the numbers for the sake of example:
a1 <- c("rna", "rna", "dna", "dna", "cell", "cell", "cell", "hair", "nail", "finger", "toe", "scallop", "brow", "mitosis", "my toes is")
a2 <- c("darwin", "darwin", "darwin", "einstein", "einstein", "einstein", "einstein", "pollack", "newton", "leibniz", "johnson", "no idea", "you", "me", "no one")
a3 <- c("5.5", "5.5", "5.6", "5.5", "5.4", "5.2", "5.4", "5.6", "2", "3", "1", "-1", "5.5", "-5.5", "72.4")
df <- data.frame(a1, a2, a3)
Afterwards, I'm trying to get:
> plyr::count(df$a1)
1 cell 3
2 dna 2
3 rna 2
4 wild 8
> plyr::count(df$a2)
1 darwin 3
2 einstein 4
3 wild 8
> plyr::count(df$a3)
1 5.4 2
2 5.5 4
3 5.6 2
4 wild 7
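For reference, here is a minimal base R sketch of one way the counts above could be produced. It assumes "wildly wrong" means any answer given by fewer than two students. It does not use a strict "top 3" rule, because in a2 only two answers repeat and a top-3 cutoff would have to break a tie among the one-off answers. The helper name `lump_rare` and its parameters `min_count` and `other` are hypothetical, introduced only for illustration.

```r
# Example data from the question, repeated so the sketch runs on its own
a1 <- c("rna", "rna", "dna", "dna", "cell", "cell", "cell", "hair", "nail", "finger", "toe", "scallop", "brow", "mitosis", "my toes is")
a2 <- c("darwin", "darwin", "darwin", "einstein", "einstein", "einstein", "einstein", "pollack", "newton", "leibniz", "johnson", "no idea", "you", "me", "no one")
a3 <- c("5.5", "5.5", "5.6", "5.5", "5.4", "5.2", "5.4", "5.6", "2", "3", "1", "-1", "5.5", "-5.5", "72.4")
df <- data.frame(a1, a2, a3)

# Hypothetical helper: replace every answer that occurs fewer than
# `min_count` times in the column with the label `other`.
lump_rare <- function(x, min_count = 2, other = "wild") {
  x <- as.character(x)                        # also handles factor columns
  counts <- table(x)                          # frequency of each distinct answer
  keep <- names(counts)[counts >= min_count]  # answers common enough to keep
  x[!(x %in% keep) & !is.na(x)] <- other      # recode the rest, leave NA as NA
  x
}

# Apply the helper to every column, keeping the data frame structure
df[] <- lapply(df, lump_rare)

table(df$a1)
table(df$a2)
table(df$a3)
```

If an extra dependency is acceptable, `forcats::fct_lump_min(x, min = 2, other_level = "wild")` should do the same recoding on a factor, and `purrr::map()` could stand in for `lapply()` when looping over the columns.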