I have items rated into 3 categories by 3 annotators each. In 52% of the cases all 3 annotators agreed on the same category, in 43% two annotators agreed on one category, and in only 5% each annotator chose a different category.
I calculated both Fleiss' kappa and Krippendorff's alpha, but the Krippendorff value is much lower than the Fleiss one: 0.032 versus 0.49.
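For reference, here is a minimal sketch of how I compute both values, using the statsmodels and krippendorff packages. The synthetic data below only reproduces my agreement pattern (52/43/5 out of 100 items), not my actual labels; the roughly uniform spread over categories is an assumption, and the real marginal distribution could matter:

```python
import numpy as np
import krippendorff
from statsmodels.stats import inter_rater as irr

rng = np.random.default_rng(0)
ratings = []  # one row per item: the 3 annotators' labels (0, 1, 2)
for _ in range(52):                      # 52%: all three annotators agree
    c = rng.integers(3)
    ratings.append([c, c, c])
for _ in range(43):                      # 43%: two agree, one differs
    c, d = rng.choice(3, size=2, replace=False)
    ratings.append([c, c, d])
for _ in range(5):                       # 5%: all three differ
    ratings.append(list(rng.permutation(3)))
ratings = np.array(ratings)

# Fleiss' kappa expects an items x categories count table
table, _ = irr.aggregate_raters(ratings, n_cat=3)
print("Fleiss kappa:", irr.fleiss_kappa(table, method="fleiss"))

# Krippendorff's alpha expects a raters x items matrix; the
# default level of measurement is interval, so I set nominal
print("Krippendorff alpha:",
      krippendorff.alpha(reliability_data=ratings.T,
                         level_of_measurement="nominal"))
```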
Isn't the agreement too low, especially according to Krippendorff's alpha?