First of all, many thanks for your help. I have a table that I import with pandas as df. For each unique (foo, bar) combination, I want to obtain a new df containing the minimum of zoo and the corresponding qux. I have simplified my dataframe here; in reality I have several hundred qux and foo values and a few dozen bar values.

My input table:

foo bar zoo qux
aaa HB1 9.75 lab1
aaa HB1 4.87 lab2
aaa HB1 3.05 lab3
aaa TS3 8.51 lab1
aaa TS3 2.58 lab2
aaa TS3 2.48 lab3
bbb HB1 9.03 lab1
bbb HB1 6.11 lab2
bbb HB1 7.66 lab3
bbb TS3 3.57 lab1
bbb TS3 4.25 lab2
bbb TS3 1.63 lab3

My expected result:

foo bar zoo qux
aaa HB1 3.05 lab3
aaa TS3 2.48 lab3
bbb HB1 6.11 lab2
bbb TS3 1.63 lab3

I tried groupby and pivot_table. I obtained the minimum zoo for each foo and bar, but not the corresponding qux, and the df was completely reshaped and no longer looked like the original format. I'm a little bit lost.

Many thanks in advance for your help.

mama_oooh

2 Answers

You can sort the values in your dataframe in descending order and use groupby.tail(1):

df.sort_values(by=['foo','bar','zoo','qux'],ascending=False).groupby(['foo','bar']).tail(1)

    foo  bar   zoo   qux
11  bbb  TS3  1.63  lab3
7   bbb  HB1  6.11  lab2
5   aaa  TS3  2.48  lab3
2   aaa  HB1  3.05  lab3
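
For anyone who wants to try this, here is a minimal, self-contained sketch of the sort-then-tail approach. The DataFrame is rebuilt by hand from the table in the question, and the final `sort_index()` is an optional addition (not part of the one-liner above) that restores the original row order.

```python
import pandas as pd

# Rebuild the sample table from the question.
df = pd.DataFrame({
    "foo": ["aaa"] * 6 + ["bbb"] * 6,
    "bar": (["HB1"] * 3 + ["TS3"] * 3) * 2,
    "zoo": [9.75, 4.87, 3.05, 8.51, 2.58, 2.48,
            9.03, 6.11, 7.66, 3.57, 4.25, 1.63],
    "qux": ["lab1", "lab2", "lab3"] * 4,
})

# A descending sort puts the smallest zoo last within each (foo, bar) group,
# so tail(1) keeps exactly the minimum row of every group.
result = (
    df.sort_values(by=["foo", "bar", "zoo"], ascending=False)
      .groupby(["foo", "bar"])
      .tail(1)
      .sort_index()  # optional: back to the original row order
)

print(result)
```

Since only whole rows are selected, the four original columns are kept and nothing is reshaped.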
sophocles

Get the index labels of the minimum zoo in each group via groupby and idxmin, then use them to select the matching rows from the original df:

df.loc[df.groupby(['foo', 'bar']).zoo.idxmin()]
 
    foo  bar   zoo   qux
2   aaa  HB1  3.05  lab3
5   aaa  TS3  2.48  lab3
7   bbb  HB1  6.11  lab2
11  bbb  TS3  1.63  lab3
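
A runnable sketch of the same idea, split into two steps so the intermediate result is visible. The DataFrame is rebuilt by hand from the question's table; the `min_idx` name and the trailing `reset_index(drop=True)` are my own additions for illustration and can be dropped if you want to keep the original index.

```python
import pandas as pd

# Rebuild the sample table from the question.
df = pd.DataFrame({
    "foo": ["aaa"] * 6 + ["bbb"] * 6,
    "bar": (["HB1"] * 3 + ["TS3"] * 3) * 2,
    "zoo": [9.75, 4.87, 3.05, 8.51, 2.58, 2.48,
            9.03, 6.11, 7.66, 3.57, 4.25, 1.63],
    "qux": ["lab1", "lab2", "lab3"] * 4,
})

# Step 1: for each (foo, bar) group, the index label of the row
# holding the smallest zoo.
min_idx = df.groupby(["foo", "bar"])["zoo"].idxmin()

# Step 2: select those rows from the original frame; all columns
# (including qux) come along unchanged.
result = df.loc[min_idx].reset_index(drop=True)

print(min_idx)
print(result)
```

If several rows in a group share the same minimum zoo, idxmin returns the first one, so you still get one row per group.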
sammywemmy
    Great, thanks a lot! I didn't know about idxmin(). Well, I've learned something new today. Thanks Sammywemmy! – mama_oooh Dec 21 '21 at 13:24