I have a dataframe df as following:
userId pageId tag
0 3122471 e852 18
1 3122471 f3e2 18
2 3122471 7e93 18
3 3122471 2768 6
4 3122471 53d9 6
5 3122471 06d7 15
6 3122471 e31c 15
7 3122471 c6f3 2
8 1234123 fjwe 1
9 1234123 eiae 4
10 1234123 ieha 4
After using df.groupby(['userId', 'tag'])['pageId'].count()
to group the data by userId and tag .
I will get:
userId tag
3122471 2 1
6 2
15 2
18 3
1234123 1 1
4 2
Now I want to find the tag that each user has the most. Just as following:
userId tag
3122471 18
1234123 4
(Note: if there are multiple tags that has the same count, I want to use a function my_rule
to determine which to show)