ratings <- data.frame(
  User = c("John", "Maria", "Anton", "Roger", "Martina", "Ana", "Sergi", "Marc", "Jim", "Chris"),
  Star.Wars.IV...A.New.Hope = c(1, 5, NA, NA, 4, 2, NA, 4, 5, 4),
  Star.Wars.VI...Return.of.the.Jedi = c(5, 3, NA, 3, 3, 4, NA, NA, 1, 2),
  Forrest.Gump = c(2, NA, NA, NA, 4, 4, 3, NA, NA, 2)
)
ratings
      User Star.Wars.IV...A.New.Hope Star.Wars.VI...Return.of.the.Jedi Forrest.Gump
1     John                         1                                 5            2
2    Maria                         5                                 3           NA
3    Anton                        NA                                NA           NA
4    Roger                        NA                                 3           NA
5  Martina                         4                                 3            4
6      Ana                         2                                 4            4
7    Sergi                        NA                                NA            3
8     Marc                         4                                NA           NA
9      Jim                         5                                 1           NA
10   Chris                         4                                 2            2
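If you want to include the NAs in the total ratings count (that is, divide the number of ratings of 4 or higher by the number of all users, whether or not they rated the film):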
colSums(ratings[, -1] >= 4, na.rm = TRUE) / nrow(ratings)
Star.Wars.IV...A.New.Hope Star.Wars.VI...Return.of.the.Jedi Forrest.Gump
                      0.5                               0.2          0.2
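To see where the 0.5 comes from, here is a minimal sketch on a single column. The vector sw4 and the helper names hits, n_high and n_users are introduced only for this illustration; the values are copied from the Star Wars IV column above. The comparison yields NA for missing ratings, na.rm = TRUE drops those from the sum only, and the denominator still counts every user.

```r
# Star Wars IV ratings, copied from the data frame above (illustrative vector)
sw4 <- c(1, 5, NA, NA, 4, 2, NA, 4, 5, 4)

hits    <- sw4 >= 4                  # TRUE/FALSE, and NA where the rating is missing
n_high  <- sum(hits, na.rm = TRUE)   # ratings of 4 or higher; NAs are skipped in the sum
n_users <- length(sw4)               # all users, including those who did not rate

n_high / n_users                     # 0.5, matching the first column of the result above
```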
If you want to exclude the NAs from the total ratings count (that is, divide only by the number of users who actually rated the film):
colMeans(ratings[, -1] >= 4, na.rm = TRUE)
Star.Wars.IV...A.New.Hope Star.Wars.VI...Return.of.the.Jedi Forrest.Gump
             0.7142857143                      0.2857142857 0.4000000000
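The difference from the previous version is only the denominator: with na.rm = TRUE, the mean is taken over the non-missing entries. A short sketch on the Forrest Gump column, where gump, n_rated and n_high are names made up for this illustration:

```r
# Forrest Gump ratings, copied from the data frame above (illustrative vector)
gump <- c(2, NA, NA, NA, 4, 4, 3, NA, NA, 2)

n_rated <- sum(!is.na(gump))              # users who actually rated the film
n_high  <- sum(gump >= 4, na.rm = TRUE)   # ratings of 4 or higher

n_high / n_rated                          # 0.4
mean(gump >= 4, na.rm = TRUE)             # same value: NAs are dropped before averaging
```

For the whole data frame, the equivalent of this manual version would be colSums(ratings[, -1] >= 4, na.rm = TRUE) / colSums(!is.na(ratings[, -1])).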