Printing the top 2 most frequent values of the target column

Question

I have three columns, as shown below, and I am trying to return the two most frequent values (top 1 and top 2 by count) of the third column. I want the output to look like the expected output shown further down.

DATA :

print (df)

   AGE GENDER rating
0   10      M     PG
1   10      M      R
2   10      M      R
3    4      F   PG13
4    4      F   PG13

CODE :

s = (df.groupby(['AGE', 'GENDER'])['rating']
       .apply(lambda x: x.value_counts().head(2))
       .rename_axis(('a','b', 'c'))
       .reset_index(level=2)['c'])

output :

print (s)

a   b
4   F    PG13
10  M       R
    M      PG
Name: c, dtype: object

EXPECTED OUTPUT :

print (s['F'])
('PG13')

print (s['M'])

('PG13', 'R')

Shouldn't `('PG13', 'R')` be `('PG', 'R')` instead? – jezrael Feb 13 '18 at 14:17

score 1 · Accepted Answer · answered Feb 13 '18 at 14:17


I think you need:

s = (df.groupby(['AGE', 'GENDER'])['rating']
       .apply(lambda x: x.value_counts().head(2))
       .rename_axis(('a','b', 'c'))
       .reset_index()
       .groupby('b')['c']
       .apply(list)
       .to_dict()
       )
print (s)
{'M': ['R', 'PG'], 'F': ['PG13']}
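
For readers who want to follow the logic step by step, here is a minimal sketch of the same idea written as an explicit loop. It is not the chain above, only an illustration under stated assumptions: the sample frame is rebuilt from the question, and the helper name `top_ratings` and its parameter `n` are hypothetical. It takes the top `n` ratings per (AGE, GENDER) group and then collects them per gender.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "AGE": [10, 10, 10, 4, 4],
    "GENDER": ["M", "M", "M", "F", "F"],
    "rating": ["PG", "R", "R", "PG13", "PG13"],
})


def top_ratings(frame, n=2):
    """Collect the n most frequent ratings of each (AGE, GENDER) group per gender.

    Hypothetical helper for illustration; not part of pandas.
    """
    result = {}
    # Grouping by a list of keys yields (key_tuple, sub_series) pairs.
    for (age, gender), ratings in frame.groupby(["AGE", "GENDER"])["rating"]:
        # value_counts() sorts by count in descending order, so the first n
        # index labels are the n most frequent ratings in this group.
        top_n = ratings.value_counts().head(n).index.tolist()
        result.setdefault(gender, []).extend(top_n)
    return result


top = top_ratings(df)
print(top)  # {'F': ['PG13'], 'M': ['R', 'PG']}
```

Individual genders can then be looked up with `top['M']` or `top['F']`. If the same gender appears in several age groups, its lists are concatenated, so a rating may appear more than once.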

jezrael


Awesome, it worked... You saved me again. Is there any other way that I can thank you? I would do it :) Thanks a lot. – pylearner Feb 13 '18 at 14:31
You are welcome! Btw, does this [solution](https://stackoverflow.com/a/48724663/2901002) not work? – jezrael Feb 13 '18 at 14:33

No jezz, I have 15 columns that I need to group by, and my sequence combination changes. There are also null values that it should handle; I had to ignore them and change my sequence again. So the solution above worked. I have created conditions and inserted your code there. – pylearner Feb 13 '18 at 14:38
How can I connect with you through LinkedIn or any social network? – pylearner Feb 14 '18 at 14:50
I don't use FB or anything similar. You can send me an email, but I don't check it very often. ;) The email is in my profile. – jezrael Feb 14 '18 at 14:51
