Pandas: Select all rows with maximum column values

Asked Mar 04 '21 at 19:22

Active Mar 04 '21 at 19:32

Viewed 39 times

I have the following dataframe, my_df:

user_id |  spend |  transaction_id |
--------+--------+-----------------|
1       |   45   |        12       |
2       |   33   |        45       |
3       |   12   |        33       |
1       |   22   |        56       |
1       |   77   |        99       |
2       |   44   |        68       |

My goal is to get all rows with the greatest transaction_id for each user_id.

So I want my final result to look like this:

user_id |  spend |  transaction_id |
--------+--------+-----------------|
1       |   77   |        99       |
2       |   44   |        68       |
3       |   12   |        33       |

How do I do this?

asked Mar 04 '21 at 19:22

kev


`df.sort_values(['user_id','transaction_id']).drop_duplicates('user_id', keep='last')` – Quang Hoang Mar 04 '21 at 19:25
What have you tried so far based on your own research? For example, groupby() with `.max()` might work for you – G. Anderson Mar 04 '21 at 19:25
@G.Anderson groupby() will cause it to drop the `spend` column, but I want to retain it without any aggregation on it. @Quang Hoang's answer works perfectly – kev Mar 04 '21 at 19:46
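
For reference, here is a minimal sketch of the one-liner from the first comment, run against the sample data in the question. The result names (`latest_by_sort`, `latest_by_idxmax`, `all_max_rows`) are illustrative and not from the thread, and `reset_index(drop=True)` is added only to tidy the row labels. The `idxmax` and `transform` variants are alternatives that nobody in the comments proposed; they are included on the assumption that keeping `spend` without aggregating it is the main requirement.

```python
import pandas as pd

# Sample data copied from the question.
my_df = pd.DataFrame({
    "user_id": [1, 2, 3, 1, 1, 2],
    "spend": [45, 33, 12, 22, 77, 44],
    "transaction_id": [12, 45, 33, 56, 99, 68],
})

# Approach from the comment: sort so the largest transaction_id
# comes last within each user_id, then keep only that last row.
latest_by_sort = (
    my_df.sort_values(["user_id", "transaction_id"])
         .drop_duplicates("user_id", keep="last")
         .reset_index(drop=True)
)

# Alternative (not from the thread): look up the row label of the
# maximum transaction_id per user_id and select those whole rows.
latest_by_idxmax = (
    my_df.loc[my_df.groupby("user_id")["transaction_id"].idxmax()]
         .reset_index(drop=True)
)

# Alternative (not from the thread): compare each row against its
# group's maximum. Unlike the two approaches above, this keeps every
# tied row if a user_id has several rows sharing the maximum.
group_max = my_df.groupby("user_id")["transaction_id"].transform("max")
all_max_rows = (
    my_df[my_df["transaction_id"] == group_max]
        .sort_values("user_id")
        .reset_index(drop=True)
)

print(latest_by_sort)
```

All three keep the `spend` column untouched, because they select existing rows instead of aggregating. On this sample data they return the same three rows; they would differ only if a `user_id` had duplicate maximum `transaction_id` values.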

