I have the following data frame:
import pandas as pd

data = {'Query': ['Carpal Ts', 'Dermallxde'],
        'Items': [['Sir chucks', 'Oga Mathew', 'Tambe Charlse'], ['Man Ond', 'kolofata Hil', 'Haruna mendy']],
        'Raters': [[[0,0,0,1,1,0,0,1,0,1,0,0,0],[0,1,0,1,0,0,0,1,1,1,0,0,1],[1,1,1,1,1,1,1,1,1,1,1,1,1]],
                   [[1,1,1,1,1,1,1,1,1,1,1,1,1],[1,1,1,0,1,0,1,1,0,1,1,1,0],[1,1,1,1,1,1,1,1,1,1,1,1,1]]]}
results_df = pd.DataFrame(data)
For each query, the raters score each item as 0 (bad) or 1 (good).
I have tried to calculate Fleiss' kappa by building a matrix from the Raters column, so that each row of the data frame yields the Fleiss' kappa for that particular query. My attempt is below:
import numpy as np
from statsmodels.stats.inter_rater import fleiss_kappa

# create an empty list to hold the Fleiss' kappa values
kappas = []
# iterate over each row in the results dataframe
for idx, row in results_df.iterrows():
    data = row['Raters']  # get the list of rating lists from the 'Raters' column
    users = row['Items']  # get the list of items for this query
    # check that there are exactly three rating lists of equal length
    if len(data) == 3 and len(set([len(d) for d in data])) == 1:
        kappa = fleiss_kappa(np.array(data).T)  # array transposed to shape (13, 3)
        print(kappa)
        kappas.append({
            'Query': idx,
            'Items': users,
            'Fleiss Kappa': kappa
        })
# create a new dataframe from the list of Fleiss' kappa values
kappa_df = pd.DataFrame(kappas)
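One point that may matter here: the statsmodels documentation describes the input of fleiss_kappa as a table of counts, with subjects in rows and categories in columns, and points to aggregate_raters in the same module for building that table from raw ratings. The sketch below spells out that counting step and the kappa formula in plain Python for the first query. It assumes that each inner list in Raters holds the 13 raters' scores for one item, and the helper names to_count_table and fleiss_kappa_from_counts are made up for illustration. It is not a drop-in replacement for the library call.

```python
# Sketch only: assumes each inner list holds the 13 raters' 0/1 scores for one item.
def to_count_table(ratings, categories=(0, 1)):
    """Turn per-item rating lists into rows of per-category counts."""
    return [[row.count(c) for c in categories] for row in ratings]


def fleiss_kappa_from_counts(table):
    """Fleiss' kappa for a (subjects x categories) table of counts."""
    n_subjects = len(table)
    n_raters = sum(table[0])  # assumes every item received the same number of ratings
    total = n_subjects * n_raters
    # share of all ratings that fall in each category
    p_cat = [sum(row[j] for row in table) / total for j in range(len(table[0]))]
    # observed agreement per item, averaged over items
    p_obs = sum(
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ) / n_subjects
    # agreement expected by chance; the formula is undefined when this equals 1
    p_exp = sum(p * p for p in p_cat)
    return (p_obs - p_exp) / (1 - p_exp)


# ratings for the first query ('Carpal Ts') from the data frame above
query_ratings = [
    [0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]
table = to_count_table(query_ratings)  # one row per item: [count of 0s, count of 1s]
kappa = fleiss_kappa_from_counts(table)
print(table, kappa)
```

Under the same assumption, the count table has shape (3, 2) for each query, which differs from the (13, 3) array of raw ratings passed to fleiss_kappa in my attempt.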
Any help with this calculation would be much appreciated.