I have the following dataframe:

             1   2   3   4   5   6   7   8   9   10
cat   cat     1   1   1   1   1   1   0   0   1   1
      dog     1   0   1   1   1   1   0   0   1   1
      fox     0   0   0   0   0   0   0   0   0   0
      jumps   1   0   1   1   1   0   0   1   1   1
      over    1   0   1   1   1   1   0   0   1   1
      the     1   0   1   1   1   1   0   0   1   1
dog   cat     1   1   0   1   1   1   0   0   1   0
      dog     1   1   1   1   1   1   0   0   1   1
      fox     1   1   1   1   1   1   0   0   1   1
      jumps   1   1   1   1   1   1   0   1   1   1
      over    1   1   1   1   1   1   0   0   1   1
      the     1   1   1   1   1   1   1   0   1   1
fox   cat     0   0   0   0   0   0   0   0   0   0
      dog     1   1   1   1   1   1   0   0   1   1
      fox     1   1   1   1   1   1   0   0   1   1
      jumps   1   1   1   1   1   1   0   1   1   1
      over    1   1   1   1   1   1   0   0   1   1
      the     1   1   1   1   1   1   1   0   1   1
jumps cat     1   1   0   1   0   1   1   0   1   0
      dog     1   1   1   1   1   1   1   0   1   0
      fox     1   1   1   1   1   1   1   0   1   0
      jumps   1   1   1   1   1   1   0   0   1   0
      over    1   0   1   1   1   0   0   1   1   0
      the     1   0   1   1   1   1   0   0   1   0
over  cat     1   1   0   1   1   1   0   0   1   0
      dog     1   1   1   1   1   1   0   0   1   0
      fox     1   1   1   1   1   1   0   0   1   0
      jumps   1   1   0   1   0   1   1   0   1   0
      over    1   1   1   1   1   1   0   0   1   0
      the     1   0   1   1   1   0   0   1   1   0
the   cat     1   1   0   1   1   1   0   0   1   0
      dog     1   1   1   1   1   1   0   1   1   0
      fox     1   1   1   1   1   1   0   1   1   0
      jumps   1   1   0   1   1   1   0   0   1   0
      over    1   1   0   1   0   1   1   0   1   0
      the     1   1   1   1   1   1   0   0   1   0

As you can see, the first two columns are unlabelled. I want to select all rows where the first of them equals 'dog' (i.e. column1 == 'dog').

So that I end up with this:

dog   cat     1   1   0   1   1   1   0   0   1   0
dog   dog     1   1   1   1   1   1   0   0   1   1
dog   fox     1   1   1   1   1   1   0   0   1   1
dog   jumps   1   1   1   1   1   1   0   1   1   1
dog   over    1   1   1   1   1   1   0   0   1   1
dog   the     1   1   1   1   1   1   1   0   1   1

If it had a label, the solution would have been:

print(df.loc[df['label'] == 'dog'])

But because it doesn't have a label, how do I achieve that? Any suggestions will be highly appreciated. Thank you.

sshussain270

1 Answer

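The two unlabelled columns are the levels of a MultiIndex, so you select on the index with .loc rather than on a column. Pass the key inside a list (double brackets) so that the outer level is kept in the result: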

df.loc[['dog']]

Output:

           1  2  3  4  5  6  7  8  9  10
dog cat    1  1  0  1  1  1  0  0  1   0
    dog    1  1  1  1  1  1  0  0  1   1
    fox    1  1  1  1  1  1  0  0  1   1
    jumps  1  1  1  1  1  1  0  1  1   1
    over   1  1  1  1  1  1  0  0  1   1
    the    1  1  1  1  1  1  1  0  1   1
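To make the effect of the double brackets concrete, here is a minimal sketch on a hypothetical miniature of the frame from the question (the values and the three columns are made up for illustration). It compares a scalar key with a list key:

```python
import pandas as pd

# Hypothetical miniature of the question's frame: two unnamed index levels.
index = pd.MultiIndex.from_product([["cat", "dog"], ["cat", "dog", "fox"]])
df = pd.DataFrame(
    [[1, 1, 1], [1, 0, 1], [0, 0, 0], [1, 1, 0], [1, 1, 1], [1, 1, 1]],
    index=index,
    columns=[1, 2, 3],
)

# A scalar key selects the "dog" group and drops the outer level.
dropped = df.loc["dog"]

# A list key selects the same rows but keeps both index levels.
kept = df.loc[["dog"]]

print(dropped.index.nlevels)
print(kept.index.nlevels)
print(kept)
```

With the scalar key the result is indexed only by the inner level, so the 'dog' label disappears from the output; the list key preserves it, which is what the question asks for.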

Then you can call reset_index to turn the two index levels into regular columns:

df.loc[['dog']].reset_index()

Output:

  level_0 level_1  1  2  3  4  5  6  7  8  9  10
0     dog     cat  1  1  0  1  1  1  0  0  1   0
1     dog     dog  1  1  1  1  1  1  0  0  1   1
2     dog     fox  1  1  1  1  1  1  0  0  1   1
3     dog   jumps  1  1  1  1  1  1  0  1  1   1
4     dog    over  1  1  1  1  1  1  0  0  1   1
5     dog     the  1  1  1  1  1  1  1  0  1   1
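If the default level_0 and level_1 column names are not wanted, one option is to name the index levels before resetting. The sketch below assumes a small made-up frame, and the names "first" and "second" are hypothetical labels chosen only for this example. Once the levels are ordinary columns, the label-based filter from the question works as usual:

```python
import pandas as pd

# Small made-up frame with two unnamed index levels.
index = pd.MultiIndex.from_product([["cat", "dog"], ["cat", "dog"]])
df = pd.DataFrame([[1, 1], [1, 0], [1, 1], [0, 1]], index=index, columns=[1, 2])

# "first" and "second" are hypothetical level names, not from the original data.
named = df.rename_axis(index=["first", "second"])

# The named levels become regular columns with those names.
flat = named.reset_index()

# Same pattern as df.loc[df['label'] == 'dog'] in the question.
result = flat.loc[flat["first"] == "dog"]
print(result)
```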

Pandas has pretty good docs on MultiIndex.
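For completeness, two other ways of filtering on a MultiIndex level that should select the same rows as df.loc[['dog']] are sketched below. This is an illustration on a small made-up frame, not part of the original answer:

```python
import pandas as pd

# Small made-up frame with two unnamed index levels.
index = pd.MultiIndex.from_product([["cat", "dog"], ["cat", "dog"]])
df = pd.DataFrame([[1, 1], [1, 0], [1, 1], [0, 1]], index=index, columns=[1, 2])

# Boolean mask built from the values of the outer (position 0) index level.
by_mask = df[df.index.get_level_values(0) == "dog"]

# Cross-section on level 0; drop_level=False keeps the "dog" level in the result.
by_xs = df.xs("dog", level=0, drop_level=False)

print(by_mask)
print(by_xs)
```

The mask form is closest in spirit to the df['label'] == 'dog' pattern from the question, since it is an ordinary boolean comparison, just applied to an index level instead of a column.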

Scott Boston

@MaxU I used your [read_clipboard_mi function](https://stackoverflow.com/a/45741989/6361531). One of the best SO answers of all time! – Scott Boston Nov 21 '17 at 22:52