AssertionError Fleiss' Kappa for IAA

Question

I have the following data frame:

import pandas as pd

data = {'Query' : ['Carpal Ts', 'Dermallxde'],
        'Items' : [['Sir chucks', 'Oga Mathew', 'Tambe Charlse'],['Man Ond', 'kolofata Hil', 'Haruna mendy']],
        'Raters' : [[[0,0,0,1,1,0,0,1,0,1,0,0,0],[0,1,0,1,0,0,0,1,1,1,0,0,1],[1,1,1,1,1,1,1,1,1,1,1,1,1]],[[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,1,0,1,0,1,1,0,1,1,1,0],[1,1,1,1,1,1,1,1,1,1,1,1,1]]]}

results_df = pd.DataFrame(data)

For each query, the raters rate each item as 0 (bad) or 1 (good).

I have tried to calculate Fleiss' Kappa by building a matrix from the Raters column, so that for each row I get the Fleiss' Kappa for that particular Query. Please see my attempt below:

import numpy as np
from statsmodels.stats.inter_rater import fleiss_kappa

# create an empty list to hold the Fleiss' Kappa values
kappas = []

# iterate over each index in the results dataframe
for idx, row in results_df.iterrows():
    data = row['Raters'] # get the list of lists in the 'Raters' field
    users = row['Items'] # get the list of users with data
    
    # check if there are exactly three non-null columns with lists of equal length
    if len(data) == 3 and len(set([len(d) for d in data])) == 1:
        kappa = fleiss_kappa(np.array(data).T)
        print(kappa)
        kappas.append({
            'Query': idx,
            'Items': users,
            'Fleiss Kappa': kappa
        })

# create a new dataframe from the list of Fleiss' Kappa values
kappa_df = pd.DataFrame(kappas)

Any help with this calculation would be much appreciated.

@ScottHunter There are no details because it returns nothing; I just get "AssertionError: ". Any other way to get the kappa would be appreciated. — cndze, May 01 '23 at 15:49

Maria Tsfasman · Answer 1 · 2023-05-04T16:27:42.327

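I had the same issue. If I understand your data correctly, you first need to convert the data from shape (subject, rater) to (subject, cat_counts), as described here. fleiss_kappa() expects a table of per-category counts for each subject, not the raw ratings, so pass the data through aggregate_raters() before computing kappa. aggregate_raters() returns a tuple whose first element is the count table, hence the [0] in the code below. Hope this helps!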

import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# create an empty list to hold the Fleiss' Kappa values
kappas = []

# iterate over each index in the results dataframe
for idx, row in results_df.iterrows():
    data = row['Raters'] # get the list of lists in the 'Raters' field
    users = row['Items'] # get the list of users with data
    
    # check if there are exactly three non-null columns with lists of equal length
    if len(data) == 3 and len(set([len(d) for d in data])) == 1:
        kappa = fleiss_kappa(aggregate_raters(np.array(data).T)[0])
        print(kappa)
        kappas.append({
            'Query': idx,
            'Items': users,
            'Fleiss Kappa': kappa
        })

# create a new dataframe from the list of Fleiss' Kappa values
kappa_df = pd.DataFrame(kappas)
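
To make the reshaping step above concrete, here is a minimal plain-Python sketch of what the conversion from (subject, rater) to (subject, cat_counts) does, and how Fleiss' Kappa is then computed from the counts. The function names to_category_counts and fleiss_kappa_from_counts and the small ratings example are made up for illustration; they are not part of statsmodels. The check on row sums mirrors the kind of consistency check that most likely triggers the bare AssertionError when raw 0/1 ratings are passed in directly.

```python
from collections import Counter


def to_category_counts(ratings, categories=(0, 1)):
    """Turn rows of raw labels (one row per subject, one entry per rater)
    into rows of counts (one row per subject, one entry per category)."""
    table = []
    for row in ratings:
        counts = Counter(row)
        table.append([counts[c] for c in categories])
    return table


def fleiss_kappa_from_counts(table):
    """Fleiss' Kappa from a (subject, category-count) table."""
    n_sub = len(table)
    n_rat = sum(table[0])
    # Every subject must have been rated the same number of times.
    # Raw 0/1 ratings passed in by mistake usually fail this check.
    if any(sum(row) != n_rat for row in table):
        raise ValueError("rows must all sum to the number of raters")
    total = n_sub * n_rat
    n_cat = len(table[0])
    # Share of all assignments that went to each category.
    p_cat = [sum(row[j] for row in table) / total for j in range(n_cat)]
    # Observed agreement per subject, then averaged over subjects.
    p_sub = [(sum(c * c for c in row) - n_rat) / (n_rat * (n_rat - 1))
             for row in table]
    p_bar = sum(p_sub) / n_sub
    # Agreement expected by chance.
    p_exp = sum(p * p for p in p_cat)
    return (p_bar - p_exp) / (1 - p_exp)


# Hypothetical example: 4 subjects, each rated by 3 raters with 0 or 1.
ratings = [[1, 1, 1],
           [0, 0, 0],
           [1, 1, 0],
           [0, 0, 1]]

table = to_category_counts(ratings)
kappa = fleiss_kappa_from_counts(table)
print(table)            # [[0, 3], [3, 0], [1, 2], [2, 1]]
print(round(kappa, 4))  # 0.3333
```

Calling fleiss_kappa_from_counts(ratings) on the raw ratings raises the ValueError, because the rows sum to 3, 0, 2 and 1 instead of all summing to the number of raters. Note that rows must correspond to the things being rated and columns to the raters, so whether the .T in the loop above is right depends on which axis of the Raters lists holds the raters.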
