4

I am interested in the first group in level 2 and want to get all the rows related to it.

Take a look at the example below:

col1  col2  col3   col4
 1     34   green   10
            yellow  20
            orange  30
       89   green   40
            yellow  50
            orange  60
 2     89   green   15
            yellow  25
            orange  35
       90   green   45
            yellow  55
            orange  65

Please note that the length of row for each level 2 group is not definitely 3.

Now I want to get all the first group under col2, then result is supposed to be:

col1  col2  col3   col4
 1     34   green   10
            yellow  20
            orange  30
 2     89   green   15
            yellow  25
            orange  35

The example and problem are modified from the question: How to get the first group in a groupby of multiple columns?

I have tried the get_group method but it seems not able to address this specific question.

I am wondering if there is any one-line code could solve this kind of question? Thx!

D OG
  • 43
  • 3

2 Answers2

3

There's a quick stack/unstack solution:

df.unstack('col3').groupby(level=0).head(1).stack('col3')

Output:

                col4
col1 col2 col3      
1    34   g       10
          o       30
          y       20
2    89   g       15
          o       35
          y       25
Quang Hoang
  • 146,074
  • 10
  • 56
  • 74
2

We can do

df.groupby(level=[0,2]).head(1)
Out[342]: 
                  col4
col1 col2 col3        
1    34   green     10
          yellow    20
          orange    30
2    89   green     15
          yellow    25
          orange    35
BENY
  • 317,841
  • 20
  • 164
  • 234
  • Thank you YOBEN_S! I think your method works clear and efficient for the question. Now I am wondering if it is possible to make some extensions to get the 2nd or other ordinal groups? – D OG May 27 '20 at 15:45
  • @DOG df[df.groupby(level=[0,2]).cumcount()==N], N here is the nth value you need – BENY May 27 '20 at 15:57