Selecting a column with multiple levels of column naming

Question

How do I select a particular column from a DataFrame when there are multiple levels of naming?

>>>  x = pd.DataFrame({'instance':['first','first','first'],'foo':['a','b','c'],'bar':rand(3)})
>>> x = x.set_index(['instance','foo']).transpose()
>>> x.columns
MultiIndex
[(u'first', u'a'), (u'first', u'b'), (u'first', u'c')]
>>> x
instance     first                    
foo              a         b         c
bar       0.102885  0.937838  0.907467

(Note: this question is asked in the comments to this SO question and there is an answer also in the comments. Thought it'd be good to have it as a question in its own right.)

score 1 · Accepted Answer · answered Aug 19 '14 at 16:46

1

This is exactly the purpose of Multiindex slicers, see docs here

In [15]: idx = pd.IndexSlice

In [16]: x.loc[:,idx[:,'a']]
Out[16]: 
instance     first
foo              a
bar       0.525356

In [17]: x.loc[:,idx[:,['a','c']]]
Out[17]: 
instance     first          
foo              a         c
bar       0.525356  0.418152

answered Aug 19 '14 at 16:46

Jeff

125,376
21
220
187

1

Is there anyone in the world who can actually understand this? I've read the docs and they boggle my mind. – LondonRob Aug 20 '14 at 11:22
Also: is there a way to do this in `0.13.1`? – LondonRob Aug 20 '14 at 11:23
in older versions you can use ``.xs`` (for a single label but on on any level/axis), or ``x.loc[:,x.columns.get_level_values(1).isin(['a','c])]`` for multiple – Jeff Aug 20 '14 at 11:58

Selecting a column with multiple levels of column naming

1 Answers1