4

I know similar questions have been asked but I can't seem to find a solution to this one.

With the following code I can filter out using the columns and the first index but not the second.

import pandas as pd
import numpy as np
ix = pd.MultiIndex.from_product([ ['foo', 'bar'], ['baz', 
'can']], names=['a', 'b'])
data = np.arange(len(ix))
df = pd.DataFrame(data, index=ix, columns=['values'])
df['values2']=[1,4,5,6]
print(df)

the resulting output is as follows:

enter image description here

Notice how the last line, does not work

df.loc['foo','can']['values2']   # works
df.loc['foo']['values2']         # works
df.loc['foo','can'][:]           # works
df.loc['foo',:][:]               # works
df.loc[:,'can'][:]               # does not work. 
Chris L
  • 319
  • 1
  • 3
  • 10
  • 2
    Bookmark @coldspeed's comprehensive multiindex indexing answer, it answers this and many other questions: https://stackoverflow.com/questions/53927460/select-rows-in-pandas-multiindex-dataframe – Josh Friedlander Apr 04 '19 at 12:53

1 Answers1

3

Use slicers for more complicated selections:

idx = pd.IndexSlice

print (df.loc[idx['foo', 'can'], 'values'])
1

print (df.loc[idx['foo'], 'values'])
b
baz    0
can    1
Name: values, dtype: int32

print (df.loc[idx['foo',:], 'values'])
a    b  
foo  baz    0
     can    1
Name: values, dtype: int32

print (df.loc[idx['foo','can'], :])
values     1
values2    4
Name: (foo, can), dtype: int64

print (df.loc[idx['foo',:], :])
         values  values2
a   b                   
foo baz       0        1
    can       1        4

print (df.loc[idx[:, 'can'], :])
         values  values2
a   b                   
foo can       1        4
bar can       3        6
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252