Pythonic way to regroup a pandas dataframe using max of a column

Question

I have the following data frame that has been obtained by applying df.groupby(['category', 'unit_quantity']).count()

category	unit_quantity	Count
banana	1EA	5
eggs	100G	22
	100ML	1
full cream milk	100G	5
	100ML	1
	1L	38

Let's call this latter dataframe as grouped. I want to find a way to regroup using columns unit_quantity and Count it and get

category	unit_quantity	Count	Most Frequent unit_quantity
banana	1EA	5	1EA
eggs	100G	22	100G
	100ML	1	100G
full cream milk	100G	5	1L
	100ML	1	1L
	1L	38	1L

Now, I tried to apply grouped.groupby(level=1).max() which gives me

unit_quantity
100G	22
100ML	1
1EA	5
1L	38

Now, because the indices of the latter and grouped do not coincide, I cannot join it using .merge. Does someone know how to solve this issue?

Thanks in advance

May be good to have the text data to be posted rather jpg/png which will be easy to reproduce and test for any further answers on this. — Karn Kumar, Jul 08 '21 at 05:24
Oh weird... I didn't post any images. It should be only text. — ℂybernetician, Jul 08 '21 at 05:28
OK, closed by dupe. Because accepting means need this solution. — jezrael, Jul 08 '21 at 06:05

score 1 · Accepted Answer · answered Jul 08 '21 at 06:02

Starting from your DataFrame :

>>> import pandas as pd

>>> df = pd.DataFrame({'category': ['banana', 'eggs', 'eggs', 'full cream milk', 'full cream milk', 'full cream milk'], 
...                    'unit_quantity': ['1EA', '100G', '100ML', '100G', '100ML', '1L'], 
...                    'Count': [5, 22, 1, 5, 1, 38],}, 
...                   index = [0, 1, 2, 3, 4, 5]) 
>>> df
    category    unit_quantity   Count
0   banana                1EA       5
1   eggs                 100G      22
2   eggs                100ML       1
3   full cream milk      100G       5
4   full cream milk     100ML       1
5   full cream milk        1L      38

You can use the transform method applied on max of the column Count in order to keep your category and unit_quantity values :

>>> idx = df.groupby(['unit_quantity'])['Count'].transform(max) == df['Count']
>>> df[idx]
    category    unit_quantity   Count
0   banana                1EA       5
1   eggs                 100G      22
2   eggs                100ML       1
4   full cream milk     100ML       1
5   full cream milk        1L      38

Pythonic way to regroup a pandas dataframe using max of a column

1 Answers1