How do I get the maximum and minimum values of a column depending on another two columns in pandas dataframe?

Question

This is my first time asking a question. I have a dataframe that looks like below:

import pandas as pd
data = [['AK', 'Co',2957],
    ['AK', 'Ot', 15],
    ['AK','Petr', 86848],
    ['AL', 'Co',167],
    ['AL', 'Ot', 10592],
    ['AL',  'Petr',1667]]
my_df = pd.DataFrame(data, columns = ['State', 'Energy', 'Elec']) 
print(my_df)

I need to find the maximum and minimum values of the third column based on the first two columns. I did browse through a few stackoverflow questions but couldn't find the right way to solve this. My output should look like below:

data = [['AK','Ot', 15],
['AK','Petr',86848],
['AL','Co',167],
['AL','Ot', 10592]]

my_df = pd.DataFrame(data, columns = ['State', 'Energy', 'Elec']) 
print(my_df)

Note: Please let me know where I am lagging before leaving a negative marking on the question

This link helped me: Python pandas dataframe: find max for each unique values of an another column

@DaniMesejo, thank you for teh reply. I am only trying to find the min value and max value of third column based on first and second column. For 'state' column 'AL', the min value is 167 and max value is 10592, which need to be displayed. — DeepMeh, Dec 03 '20 at 17:13
`df.groupby(['State','Energy'])['Elec'].agg(['min','max'])`? — Quang Hoang, Dec 03 '20 at 17:14
Does this answer your question? [Pandas dataframe max and min value](https://stackoverflow.com/questions/42612676/pandas-dataframe-max-and-min-value) — Joe Ferndz, Dec 03 '20 at 17:14
It doesn't, but I need it to be printed because I need it for further plotting purpose — DeepMeh, Dec 03 '20 at 17:19
@JoeFerndz , it sort of did, but instead of printing the 'max; and 'min' next to every row, I want it to be filtered within the dataframe itself. This way, my number of rows would be reduced too. — DeepMeh, Dec 03 '20 at 17:20
This link helped me: https://stackoverflow.com/questions/40568438/python-pandas-dataframe-find-max-for-each-unique-values-of-an-another-column — DeepMeh, Dec 03 '20 at 17:41

score 1 · Answer 1 · answered Dec 03 '20 at 17:14

1

try idxmin and idxmax with .loc filter.

    new_df = my_df.loc[
        my_df.groupby(["State"])
        .agg(ElecMin=("Elec", "idxmin"), ElecMax=("Elec", "idxmax"))
        .stack()
    ]
)

print(new_df)

  State Energy   Elec
0    AK     Ot     15
1    AK   Petr  86848
2    AL     Co    167
3    AL     Ot  10592

answered Dec 03 '20 at 17:14

Umar.H

22,559
7
39
74

1

you can alternatively do `my_df.loc[my_df.groupby("State")['Elec'].agg(['idxmin', 'idxmax']).stack()]` ;) – Shubham Sharma Dec 03 '20 at 17:44
1

@ShubhamSharma thanks :) yeah that makes more sense and looks cleaner but i want to spread more awareness of the `.agg` with `NamedTuples` they are pretty awesome so i try to use them more in examples – Umar.H Dec 03 '20 at 17:57

How do I get the maximum and minimum values of a column depending on another two columns in pandas dataframe?

1 Answers1