
I used the irr package in R to calculate a Fleiss' kappa statistic for 263 raters who judged 7 photos (scale 1 to 7). kappam.fleiss(db) returned the kappa statistic (0.554; z = 666) and the p-value (0), but unfortunately no confidence interval for the kappa statistic is included. Can anybody help me out on how to get the confidence interval?

thx


Addition of an example of the data:

row names   rater.1   rater.2   rater.3   rater.4   rater.5   ...   rater.263
photo 1     6         6         6         6         7         ...   5
photo 2     1         2         1         1         1         ...   2
photo 3     5         5         5         5         6         ...   6
photo 4     3         1         3         3         3         ...   1
photo 5     2         3         2         2         2         ...   3
photo 6     4         4         4         4         4         ...   4
photo 7     7         7         7         7         5         ...   7

koen_huys

2 Answers


A confidence interval is not provided by the irr package. You may be able to calculate one from the test statistics it does return (if so, as 42 said, that's a question for Cross Validated).

However, this is provided by the raters package.

library(raters)
data(diagnostic)                              # example data set shipped with the raters package
concordance(diagnostic, test = "Chisq")
concordance(diagnostic, test = "Normal")
concordance(diagnostic, test = "MC", B = 100)
Inter-rater Agreement 
$Fleiss
      Kappa         LCL         UCL   Std.Error     Z value    Pr(>|z|) 
 0.43024452  0.38247249  0.47801655  0.02437393 17.65183058  0.00000000 

$Statistic
        S       LCL       UCL    pvalue 
0.4444444 0.3555556 0.5404861 0.0000000

https://cran.r-project.org/web/packages/raters/raters.pdf
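
Note that concordance() expects a table of counts (one row per subject, one column per rating category) rather than the raw ratings. As a minimal sketch for your case (assuming db is the 7 x 263 data frame of raw scores shown in the question, with integer ratings 1 to 7), you could first tabulate the ratings per photo and then ask for the Normal-approximation interval:

library(raters)
# db: 7 photos x 263 raters with integer scores 1-7 (assumed layout)
counts <- t(apply(db, 1, function(x) tabulate(as.integer(x), nbins = 7)))
concordance(counts, test = "Normal")   # kappa with lower/upper confidence limits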

Hack-R
  • So it was really a request to do searching. – IRTFM Nov 28 '16 at 01:19
  • Thanks for your response! However, it gives completely different values? I have edited the original post with an example of the data. Hope you can help! – koen_huys Dec 11 '16 at 00:06
  • I am also finding a different value with `raters` than with `irr`. According to https://cran.r-project.org/web/packages/raters/raters.pdf their implementation is a modified version. Disconcertingly different outcomes, though. – alle_meije Dec 19 '17 at 13:01

The difference between the kappam.fleiss function and the concordance function is that the first expects the raw ratings (one row per subject, one column per rater), while the second expects summary counts (one row per subject, one column per rating category). Look at the following example from the Wikipedia page:

DATA <- data.frame(Rater1 = c(5, 2, 3, 2, 1, 1, 1, 1, 1, 2))
DATA$Rater2 <- c(5, 2, 3, 2, 1, 1, 1, 1, 1, 2)
DATA$Rater3 <- c(5, 3, 3, 2, 2, 1, 1, 2, 1, 3)
DATA$Rater4 <- c(5, 3, 4, 3, 2, 1, 2, 2, 1, 3)
DATA$Rater5 <- c(5, 3, 4, 3, 3, 1, 2, 2, 1, 4)
DATA$Rater6 <- c(5, 3, 4, 3, 3, 1, 3, 2, 1, 4)
DATA$Rater7 <- c(5, 3, 4, 3, 3, 1, 3, 2, 2, 4)
DATA$Rater8 <- c(5, 3, 4, 3, 3, 2, 3, 3, 2, 5)
DATA$Rater9 <- c(5, 4, 5, 3, 3, 2, 3, 3, 2, 5)
DATA$Rater10 <- c(5, 4, 5, 3, 3, 2, 3, 3, 2, 5)
DATA$Rater11 <- c(5, 4, 5, 3, 3, 2, 3, 4, 2, 5)
DATA$Rater12 <- c(5, 4, 5, 3, 3, 2, 4, 4, 3, 5)
DATA$Rater13 <- c(5, 5, 5, 4, 4, 2, 4, 5, 3, 5)
DATA$Rater14 <- c(5, 5, 5, 4, 5, 2, 4, 5, 4, 5)

library("irr")
kappam.fleiss(DATA)

TABLE <- matrix(rep(0, 50), nrow = 10)
for (COLUMN in 1:5) {
  for (ROW in 1:10) {
    TABLE[ROW, COLUMN] <- sum(DATA[ROW,] == COLUMN)
  }
}


library(raters) 
concordance(db = TABLE, test="Normal")
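
As an aside, the same counts table can be built without the explicit loop (assuming, as above, that the ratings are the integers 1 to 5):

TABLE <- t(apply(DATA, 1, tabulate, nbins = 5))   # rows = subjects, columns = categories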
Yishai