Python Dataframe select rows based on max values in one of the columns

Question

I have a dataframe in python (many rows, 2 columns). I want to modify the DF with a unique value in column 1 based on the largest value in column 2 (column 2 is sorted in ascending order if that helps). I could probably write a loop but would prefer a one or two line solution. Thanks.

Ex.

ID         Value
100       11
100       14
100       16
200       10
200       20
200       30
300       45
400        0
400       25

desired result

100       16
200       30
300       45
400       25

score 11 · Answer 1 · answered Mar 28 '17 at 13:24

11

You want to groupby on 'a' column and then get the index of the max value using idxmax and use these indices to index the orig df:

In [12]:
df.loc[df.groupby('a')['b'].idxmax()]

Out[12]:
     a   b
2  100  16
5  200  30
6  300  45
8  400  25

answered Mar 28 '17 at 13:24

EdChum

376,765
198
813
562

score 2 · Answer 2 · answered Mar 28 '17 at 13:35

2

In case you don't need the original index but just the highest values per ID, you can use groupby and max:

print(df.groupby("ID").max())

     Value
ID  
100     16
200     30
300     45
400     25

answered Mar 28 '17 at 13:35

pansen

6,433
4
19
32

score 0 · Answer 3 · answered Nov 01 '20 at 15:15

input data

date = {'ID':[100,100,100,200,200,200,300,400,400],'Value':[11,14,16,10,20,30,45,0,25]}      
df= pd.DataFrame.from_dict(dane)

ex:

    ID  Value
0  100     11
1  100     14
2  100     16
3  200     10
4  200     20
5  200     30
6  300     45
7  400      0
8  400     25

and solutions:

df = df.groupby('ID')[['Value']].max().reset_index()

output:

    ID  Value
0  100     16
1  200     30
2  300     45
3  400     25

Python Dataframe select rows based on max values in one of the columns

3 Answers3

Linked