3

I've generated a Pandas dataframe from an nested dictionary, which looks like this:

                Col1      Col2      Col3
XXX   aaa         1         8         1
      bbb         9         7         4
      ccc         3         5         9
      ddd         2         7         2

YYY   aaa         8         7         6
      bbb         1         7         3
      ccc         2         4         5
      ddd         2         1         1

ZZZ   aaa         2         2         4
      bbb         1         7         5
      ccc         0         1         2
      ddd         0         9         6

I'm now struggling with following tasks:

1.getting all data, where the second index = "bbb" including the first index value, like in this example:

               Col1      Col2      Col3
XXX   bbb         1         8         1
YYY   bbb         8         7         6
ZZZ   bbb         2         2         4  

2.grouping the dataframe in several dataframes grouped by the first Index Values ("XXX", "YYY", ... )

Browsing through the Pandas doc and related stackoverflow questions didn' bring me further ...

jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
Into Numbers
  • 923
  • 11
  • 19
  • THX for all the quick answers! I've accepted piRSquared's answer, because his method also solved my "group by" problem in a way (without level=0 index) which was best for me via df.loc[pd.IndexSlice['XXX'], :] – Into Numbers May 06 '16 at 19:27

3 Answers3

3

In nested DataFrame is not index but MultiIndex.

import pandas as pd
df = pd.DataFrame({'Col2': {('ZZZ', 'ccc'): 1, ('ZZZ', 'aaa'): 2, ('ZZZ', 'ddd'): 9, ('XXX', 'aaa'): 8, ('XXX', 'ccc'): 5, ('YYY', 'bbb'): 7, ('XXX', 'ddd'): 7, ('ZZZ', 'bbb'): 7, ('YYY', 'ddd'): 1, ('YYY', 'aaa'): 7, ('YYY', 'ccc'): 4, ('XXX', 'bbb'): 7}, 'Col3': {('ZZZ', 'ccc'): 2, ('ZZZ', 'aaa'): 4, ('ZZZ', 'ddd'): 6, ('XXX', 'aaa'): 1, ('XXX', 'ccc'): 9, ('YYY', 'bbb'): 3, ('XXX', 'ddd'): 2, ('ZZZ', 'bbb'): 5, ('YYY', 'ddd'): 1, ('YYY', 'aaa'): 6, ('YYY', 'ccc'): 5, ('XXX', 'bbb'): 4}, 'Col1': {('ZZZ', 'ccc'): 0, ('ZZZ', 'aaa'): 2, ('ZZZ', 'ddd'): 0, ('XXX', 'aaa'): 1, ('XXX', 'ccc'): 3, ('YYY', 'bbb'): 1, ('XXX', 'ddd'): 2, ('ZZZ', 'bbb'): 1, ('YYY', 'ddd'): 2, ('YYY', 'aaa'): 8, ('YYY', 'ccc'): 2, ('XXX', 'bbb'): 9}})
print df
         Col1  Col2  Col3
XXX aaa     1     8     1
    bbb     9     7     4
    ccc     3     5     9
    ddd     2     7     2
YYY aaa     8     7     6
    bbb     1     7     3
    ccc     2     4     5
    ddd     2     1     1
ZZZ aaa     2     2     4
    bbb     1     7     5
    ccc     0     1     2
    ddd     0     9     6

print df.index
MultiIndex(levels=[[u'XXX', u'YYY', u'ZZZ'], 
                   [u'aaa', u'bbb', u'ccc', u'ddd']],
           labels=[[0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2], 
                   [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]])

I think you can use for selecting function xs:

print df.xs('aaa', level=1)
     Col1  Col2  Col3
XXX     1     8     1
YYY     8     7     6
ZZZ     2     2     4

print df.xs('bbb', level=1)
     Col1  Col2  Col3
XXX     9     7     4
YYY     1     7     3
ZZZ     1     7     5

Multiindex in docs is here.

Selecting:
cross section
using slicers - other solution

And groupby by first level of Multiindex with aggregating e.g. mean:

print df.groupby(level=0).mean()
     Col1  Col2  Col3
XXX  3.75  6.75  4.00
YYY  3.25  4.75  3.75
ZZZ  0.75  4.75  4.25

Groupby with multiindex

Community
  • 1
  • 1
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
2

Solution

import pandas as pd

df.loc[pd.IndexSlice[:, 'bbb'], :]
piRSquared
  • 285,575
  • 57
  • 475
  • 624
1

Try get_level_values as in the following:

df.iloc[df.index.get_level_values(1) == 'bbb']

You can use df.index.get_level_values to expose the index levels and labels, and then construct your .iloc[] accordingly.

<bound method MultiIndex.get_level_values of MultiIndex ( levels=[[u'bar', u'baz', u'foo', u'qux'] , [u'one', u'two']], labels=[[0, 0, 1, 1, 2, 2, 3, 3] , [0, 1, 0, 1, 0, 1, 0, 1]] )>

Here is a useful reference:

selecting from multi-index pandas

Community
  • 1
  • 1
jdg
  • 547
  • 6
  • 15