I would like to reduce this dataframe, which has multiple rows with the same name but different values, to a dataframe containing only one row per name, holding the maximum value.

df = pd.DataFrame([{"Name": "Foo", "Value": 1}, {"Name": "Foo", "Value": 3}, {"Name": "Bar", "Value": 1},{"Name": "Bar", "Value": 4}])

Dataframe wanted:

df_filtered = pd.DataFrame([{"Name": "Foo", "Value": 3}, {"Name": "Bar", "Value": 4}])

I started with df.groupby, but it didn't feel natural.

Johan Tuls

1 Answer

Sort by Value so that each group's maximum comes last, then take the last row per group.

df.sort_values(by='Value').groupby('Name', as_index=False).last()
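
As a minimal, self-contained sketch of the line above, here it is applied to the sample data from the question. The variable names `df` and `df_filtered` are taken from the question. Note that with the default `groupby` settings the result comes back ordered by `Name`, so `Bar` appears before `Foo`.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame([
    {"Name": "Foo", "Value": 1},
    {"Name": "Foo", "Value": 3},
    {"Name": "Bar", "Value": 1},
    {"Name": "Bar", "Value": 4},
])

# Sort ascending so each group's maximum is its last row,
# then keep that last row per Name.
df_filtered = (
    df.sort_values(by="Value")
      .groupby("Name", as_index=False)
      .last()
)

print(df_filtered)
```

`as_index=False` keeps `Name` as a regular column instead of turning it into the index, which matches the shape of the wanted dataframe.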
mozway
  • Thanks a lot. I also tried ```new_df = pd.DataFrame([{"Name": name, "Value": name_df["Value"].max()} for name, name_df in df.groupby("Name", axis=0)])``` However, this does not feel scalable with a lot of columns. – Johan Tuls Sep 24 '21 at 08:35
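
On the concern about many columns: one possible alternative, not taken from the answer above, is to look up the index of the maximum `Value` per group with `idxmax` and select those whole rows with `loc`. This is only a sketch; the `Extra` column and the name `max_rows` are hypothetical, added to show that the remaining columns come along without being listed.

```python
import pandas as pd

# "Extra" is a hypothetical additional column, not part of the original question.
df = pd.DataFrame({
    "Name": ["Foo", "Foo", "Bar", "Bar"],
    "Value": [1, 3, 1, 4],
    "Extra": ["a", "b", "c", "d"],
})

# idxmax gives, for each Name, the index label of the row with the largest Value.
idx = df.groupby("Name")["Value"].idxmax()

# Selecting by those labels keeps every column of the matching rows.
max_rows = df.loc[idx].reset_index(drop=True)

print(max_rows)
```

Because whole rows are selected, all values in a result row come from the same original row. If several rows in a group tie for the maximum, `idxmax` picks the first one.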