Reduce a dataframe to the maximum-value row per duplicated name

Question

I would like to reduce this dataframe, which has multiple rows with the same name but different values, to a dataframe containing only one row per name, holding the maximum value.

df = pd.DataFrame([{"Name": "Foo", "Value": 1}, {"Name": "Foo", "Value": 3}, {"Name": "Bar", "Value": 1},{"Name": "Bar", "Value": 4}])

Dataframe wanted:

df_filtered = pd.DataFrame([{"Name": "Foo", "Value": 3}, {"Name": "Bar", "Value": 4}])

I started with df.groupby, but it didn't feel natural.

score 1 · Accepted Answer · answered Sep 24 '21 at 08:31

Sort by Value so that each group's maximum ends up last, then take the last row per group.

df.sort_values(by='Value').groupby('Name', as_index=False).last()
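To make the one-liner above easier to try, here is a minimal self-contained sketch that applies it to the dataframe from the question. Note that groupby sorts the group keys by default, so the result comes back ordered by Name (Bar before Foo) rather than in the order shown in the question.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame([
    {"Name": "Foo", "Value": 1},
    {"Name": "Foo", "Value": 3},
    {"Name": "Bar", "Value": 1},
    {"Name": "Bar", "Value": 4},
])

# Sort ascending so each group's largest Value comes last,
# then keep the last entry per Name.
df_filtered = (
    df.sort_values(by="Value")
      .groupby("Name", as_index=False)
      .last()
)

print(df_filtered)
```

One thing to keep in mind: last() returns the last non-null entry of each column, so if other columns contain missing values the result may combine values from different rows of the same group.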

mozway


Thanks a lot. I also tried ```new_df = pd.DataFrame([{"Name": name, "Value": name_df["Value"].max()} for name, name_df in df.groupby("Name", axis=0)])``` However, this doesn't feel scalable with a lot of columns. – Johan Tuls Sep 24 '21 at 08:35
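On the concern about many columns: one possible alternative (not from the answer above, just a sketch) is to look up the index of the maximum Value per Name with idxmax and select those whole rows, so every other column is carried along without being listed. The "Extra" column below is hypothetical and only added to illustrate that.

```python
import pandas as pd

# Data from the question plus a hypothetical "Extra" column
df = pd.DataFrame([
    {"Name": "Foo", "Value": 1, "Extra": "a"},
    {"Name": "Foo", "Value": 3, "Extra": "b"},
    {"Name": "Bar", "Value": 1, "Extra": "c"},
    {"Name": "Bar", "Value": 4, "Extra": "d"},
])

# For each Name, get the index label of the row holding the largest Value
idx = df.groupby("Name")["Value"].idxmax()

# Select those complete rows; all remaining columns come along unchanged
df_max = df.loc[idx].reset_index(drop=True)

print(df_max)
```

Because whole rows are selected, the values in each output row always come from the same input row. If several rows in a group tie for the maximum, idxmax picks the first one.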
