
I have the following DataFrame:

enter image description here

I need to create a new DataFrame from this one that includes only the maximum value of col3 for each unique pair of col1 and col2 values. The end result needs to look like the following:

enter image description here

For example, there are 2 rows where col1 is a and col2 is c. In one row, col3 is 1, and in the other, col3 is 2. I only need to include the row where col3 is 2, because 2 is the max value of col3 where col1 is a and col2 is c.

What would be the most elegant way of doing this without making it too complex?

edn
  • Hi there, 1k rep user, can you add a [mcve] please? Also see [ask]. – Umar.H Aug 07 '20 at 15:20
  • Please provide a small set of sample data **as text** that we can copy and paste. Include the corresponding desired result. Check out the guide on [how to make good reproducible pandas examples](https://stackoverflow.com/a/20159305/3620003). – timgeb Aug 07 '20 at 15:20

1 Answer

df.groupby(['col1', 'col2'], as_index=False)['col3'].max()
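
To make the one-liner above concrete, here is a minimal sketch. The sample values are made up, since the original DataFrame was only posted as an image; only the column names col1, col2 and col3 come from the question.

```python
import pandas as pd

# Hypothetical sample data (the question's DataFrame was an image).
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b"],
    "col2": ["c", "c", "d", "c", "c"],
    "col3": [1, 2, 5, 3, 4],
})

# One row per (col1, col2) pair, holding the largest col3 in that group.
# as_index=False keeps col1 and col2 as ordinary columns in the result.
result = df.groupby(["col1", "col2"], as_index=False)["col3"].max()
print(result)
```

With this data, the two rows where col1 is a and col2 is c collapse into a single row whose col3 is 2.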
sangmoo
  • This is REALLY elegant! What if there was a 4th column and I wanted to keep the value of that column as well, whatever it happens to be on that row? Your solution leaves out the potential 4th column. – edn Aug 07 '20 at 15:32
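
One possible way to keep the other columns of the matching row, sketched under assumptions: col4 and all sample values here are hypothetical. The idea is to use idxmax to find the index label of the largest col3 in each group, then select those whole rows with loc. This assumes the DataFrame's index is unique.

```python
import pandas as pd

# Hypothetical sample data; col4 stands in for the "4th column".
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b"],
    "col2": ["c", "c", "d", "c", "c"],
    "col3": [1, 2, 5, 3, 4],
    "col4": ["p", "q", "r", "s", "t"],
})

# Index label of the row holding the max col3 within each (col1, col2) group.
idx = df.groupby(["col1", "col2"])["col3"].idxmax()

# Pull the full rows, so col4 (and any other column) comes along.
result = df.loc[idx].reset_index(drop=True)
print(result)
```

If several rows in a group tie for the maximum, idxmax picks the first one, so exactly one row per group is kept.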