pandas: how to access a MultiIndex DataFrame?

Question

Here is my code:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'key1': ['a', 'a', 'b', 'b', 'a'],
...                    'key2': ['one', 'two', 'one', 'two', 'one'],
...                    'data1': np.random.randn(5),
...                    'data2': np.random.randn(5)})

>>> new_df = df.groupby(['key1', 'key2']).mean().unstack()
>>> print(new_df)
         data1               data2
key2       one       two       one       two
key1
a    -0.070742 -0.598649 -0.349283 -1.272043
b    -0.109347 -0.097627 -0.641455  1.135560 
>>> print(new_df.columns)
MultiIndex(levels=[[u'data1', u'data2'], [u'one', u'two']],
       labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
       names=[None, u'key2'])

As you can see, a DataFrame with MultiIndex columns is structured differently from a normal DataFrame. How do I access the data in it?

Though it's not easy to follow the documentation (explanations buried into an ["advanced indexing"](https://pandas.pydata.org/pandas-docs/stable/user_guide/advanced.html#advanced-indexing-with-hierarchical-index) section), keep in mind multilevel indexing is based on tuple indices, hence accessing data requires `loc` and tuples, even if there are ambiguous shortcuts not using `loc` and even not using tuples. — mins, Jan 02 '21 at 16:15

score 25 · Accepted Answer · answered Apr 23 '16 at 05:53

Accessing data in a MultiIndex DataFrame works much like it does on a regular DataFrame. For example, to read the value at row a, column (data1, two), you can use either `new_df['data1']['two']['a']` or `new_df.loc['a', ('data1', 'two')]`.
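As a minimal sketch of the two access styles above, the snippet below rebuilds the question's frame and reads the same cell both ways. The seeded generator `rng` and the variable names `chained`, `via_loc`, `col`, `sub`, and `ones` are assumptions added for illustration and are not part of the original post. It also shows a few related selections (a whole column by tuple, a top-level block, and a cross-section with `xs`) that may be useful when a single value is not enough.

```python
import numpy as np
import pandas as pd

# Seeded generator (an assumption, only so the example is reproducible)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'key1': ['a', 'a', 'b', 'b', 'a'],
    'key2': ['one', 'two', 'one', 'two', 'one'],
    'data1': rng.standard_normal(5),
    'data2': rng.standard_normal(5),
})
new_df = df.groupby(['key1', 'key2']).mean().unstack()

# Single value, chained lookups: top-level column, then key2 label, then row
chained = new_df['data1']['two']['a']

# Same value in one step: row label, then the column as a tuple
via_loc = new_df.loc['a', ('data1', 'two')]

# One full column, addressed by its (level 0, level 1) tuple
col = new_df[('data1', 'two')]

# Every column under the top-level label 'data1' (a DataFrame: one, two)
sub = new_df['data1']

# Cross-section on the inner level: the 'one' column of data1 and data2
ones = new_df.xs('one', axis=1, level='key2')

print(chained, via_loc)
print(sub)
print(ones)
```

Chained lookups are fine for reading, but when assigning values a single `loc` call with a tuple is generally the safer choice, since chained indexing may operate on a temporary copy.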

Please read the official docs for more details.

score -1 · Answer 2 · answered Oct 20 '21 at 04:16

This might help you inspect and visualize the data:

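# multi_indexDataFrame is a placeholder for your own DataFrame with a MultiIndex on the rows
unstacked = multi_indexDataFrame.unstack().dropna()
unstacked.plot(kind="bar")  # plotting requires matplotlib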

sounish nath

