I'm making a rating survey in R (Shiny) and I'm tryng to find a metric that can evaluate the agreement but for only one of the "questions" in the survey. The ratings range from 1 to 5. There is multiple raters and each rater rates a set of 10 questions according to the ratings.
I've used Fleiss Kappa and Krippendorff Alpha for the whole set of questions and raters and it works but when evaluating each question separately these metrics give negative value. I tried calculating them by hand (formulas) and I still get the same results so I guess that they don't work for a small sample of subjects (in this case a sample of 1).
I've looked at other metrics like rwg in the multilevel package but thus far I can't seem to make it work. According to r documentation:
rwg(x, grpid, ranvar=2)
Where:
x = A vector representing the construct on which to estimate agreement.
grpid = A vector identifying the groups from which x originated.
Can someone explain me what the rwg function expects from me?
If someone know some other agreement metric that might work better please let me know.
Thanks.