How to get the first group under certain level of a groupby of multiple columns?

Question

I am interested in the first group in level 2 and want to get all the rows related to it.

Take a look at the example below:

col1  col2  col3   col4
 1     34   green   10
            yellow  20
            orange  30
       89   green   40
            yellow  50
            orange  60
 2     89   green   15
            yellow  25
            orange  35
       90   green   45
            yellow  55
            orange  65

Please note that the length of row for each level 2 group is not definitely 3.

Now I want to get all the first group under col2, then result is supposed to be:

col1  col2  col3   col4
 1     34   green   10
            yellow  20
            orange  30
 2     89   green   15
            yellow  25
            orange  35

The example and problem are modified from the question: How to get the first group in a groupby of multiple columns?

I have tried the get_group method but it seems not able to address this specific question.

I am wondering if there is any one-line code could solve this kind of question? Thx!

score 3 · Answer 1 · answered May 27 '20 at 13:11

There's a quick stack/unstack solution:

df.unstack('col3').groupby(level=0).head(1).stack('col3')

Output:

                col4
col1 col2 col3      
1    34   g       10
          o       30
          y       20
2    89   g       15
          o       35
          y       25

score 2 · Accepted Answer · answered May 27 '20 at 13:16

2

We can do

df.groupby(level=[0,2]).head(1)
Out[342]: 
                  col4
col1 col2 col3        
1    34   green     10
          yellow    20
          orange    30
2    89   green     15
          yellow    25
          orange    35

answered May 27 '20 at 13:16

BENY

317,841
20
164
234

Thank you YOBEN_S! I think your method works clear and efficient for the question. Now I am wondering if it is possible to make some extensions to get the 2nd or other ordinal groups? – D OG May 27 '20 at 15:45
@DOG df[df.groupby(level=[0,2]).cumcount()==N], N here is the nth value you need – BENY May 27 '20 at 15:57

How to get the first group under certain level of a groupby of multiple columns?

2 Answers2