I have a CSV file with three columns: users, text, and labels. Each user has multiple texts and labels. I want to find the label that occurs most frequently for each user, in order to determine that user's category.
I have tried:
for i in df['user'].unique():
    print(df['class'].value_counts())
which returns the same values, shown below, for every user:
4 3062
1 1250
0 393
3 281
2 13
Name: class, dtype: int64
I also tried:
from collections import Counter
for h in df['user'].unique():
    g = Counter(df['class'])
    print(g)
and got:
Counter({4: 3062, 1: 1250, 0: 393, 3: 281, 2: 13})
Counter({4: 3062, 1: 1250, 0: 393, 3: 281, 2: 13})
Counter({4: 3062, 1: 1250, 0: 393, 3: 281, 2: 13})
Counter({4: 3062, 1: 1250, 0: 393, 3: 281, 2: 13})
Counter({4: 3062, 1: 1250, 0: 393, 3: 281, 2: 13})
Counter({4: 3062, 1: 1250, 0: 393, 3: 281, 2: 13})
Counter({4: 3062, 1: 1250, 0: 393, 3: 281, 2: 13})
Counter({4: 3062, 1: 1250, 0: 393, 3: 281, 2: 13})
Counter({4: 3062, 1: 1250, 0: 393, 3: 281, 2: 13})
Counter({4: 3062, 1: 1250, 0: 393, 3: 281, 2: 13})
Here is the sample data. Please help.
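For reference, both attempts above count the whole `class` column on every pass, because the loop variable is never used to filter the rows. That is why every user shows the same totals. Below is a minimal sketch of counting per user instead, assuming the columns are named `user` and `class` as in the attempts. The small DataFrame is hypothetical stand-in data, not the real file, and breaking ties by taking the smallest label is an assumption.

```python
import pandas as pd

# Hypothetical stand-in data; the real CSV is not reproduced here.
df = pd.DataFrame({
    "user": ["a", "a", "a", "b", "b", "c"],
    "text": ["t1", "t2", "t3", "t4", "t5", "t6"],
    "class": [4, 4, 1, 0, 0, 3],
})

# Count labels within each user's rows rather than over the whole column.
per_user_counts = df.groupby("user")["class"].value_counts()
print(per_user_counts)

# Most frequent label per user. Series.mode() returns the modes in sorted
# order, so .iloc[0] picks the smallest label when there is a tie
# (an assumed tie-breaking rule).
top_label = df.groupby("user")["class"].agg(lambda s: s.mode().iloc[0])
print(top_label)
```

If the loop form is preferred, filtering inside it with `df.loc[df['user'] == i, 'class'].value_counts()` should give the per-user counts as well.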