7

I have a dataframe in python (many rows, 2 columns). I want to modify the DF with a unique value in column 1 based on the largest value in column 2 (column 2 is sorted in ascending order if that helps). I could probably write a loop but would prefer a one or two line solution. Thanks.

Ex.

ID         Value
100       11
100       14
100       16
200       10
200       20
200       30
300       45
400        0
400       25

desired result

100       16
200       30
300       45
400       25
EdChum
  • 376,765
  • 198
  • 813
  • 562
jim g
  • 71
  • 1
  • 2

3 Answers3

11

You want to groupby on 'a' column and then get the index of the max value using idxmax and use these indices to index the orig df:

In [12]:
df.loc[df.groupby('a')['b'].idxmax()]

Out[12]:
     a   b
2  100  16
5  200  30
6  300  45
8  400  25
EdChum
  • 376,765
  • 198
  • 813
  • 562
2

In case you don't need the original index but just the highest values per ID, you can use groupby and max:

print(df.groupby("ID").max())

     Value
ID  
100     16
200     30
300     45
400     25
pansen
  • 6,433
  • 4
  • 19
  • 32
0

input data

date = {'ID':[100,100,100,200,200,200,300,400,400],'Value':[11,14,16,10,20,30,45,0,25]}      
df= pd.DataFrame.from_dict(dane)    

ex:

    ID  Value
0  100     11
1  100     14
2  100     16
3  200     10
4  200     20
5  200     30
6  300     45
7  400      0
8  400     25

and solutions:

df = df.groupby('ID')[['Value']].max().reset_index()

output:

    ID  Value
0  100     16
1  200     30
2  300     45
3  400     25
Grzesik
  • 111
  • 2
  • 3