I'm having an issue with a pandas dataframe. I have a dataframe with three columns , the first 2 are identifiers (str), and the third is a number.
I would like to group it so that i get the first column the third as a max, and the second column which index corresponding to the third.
That's not quite clear so let's give an example. My dataframe looks like:
id1 id2 amount
0 first_person first_category 18
1 first_person second_category 37
2 second_person first_category 229
3 second_person third_category 23
The code for it if you need:
df = pd.DataFrame([['first_person','first_category',18],['first_person','second_category',37],['second_person','first_category',229],['second_person','third_category',23]],columns = ['id1','id2','amount'])
And I would like to get:
id1 id2 amount
0 first_person second_category 37
1 second_person third_category 229
I have tried a groupby method, but it makes me loose the second column:
result = df.groupby(['id1'],as_index=False).agg({'amount':np.max})