After performing group-by, my new df has 3 level multindex. I need to access all rows with 'ZEBRA' labels; which is contained in the 3rd level index. I'm trying to use df.loc
but unable to do so. I thought of iterating through the labels, but that will have to be a nested loop to make below; which makes me feel I'm not thinking along the right lines, there must be a much easier.
> indexlevel1_value1->indexlevel2_value1>indexlevel3_'stabilizer'
> indexlevel1_value1->indexlevel2_value2>indexlevel3_'stabilizer'
> indexlevel1_value1->indexlevel2_value3>indexlevel3_'stabilizer'
> ...................
> indexlevel2_value1->indexlevel2_value1>indexlevel3_'stabilizer'
This question looks close - Selecting rows in a MultiIndex dataframe by index without losing any levels but focused on first level index.
import pandas as pd
import numpy as np
df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar',
'foo', 'bar', 'foo', 'foo',
'bar', 'foo', 'bar','foo',
'bar','foo' ],
'B' : ['one', 'one', 'two', 'three',
'two', 'two', 'one', 'three',
'two', 'three','two', 'two',
'one', 'three'],
'C' : ['MR', 'ZEBRA', 'KID', 'ZEBRA',
'MOS', 'ALPHA', 'ZULU', 'ZEBRA',
'TREE','PLANT', 'JOOMLA','ZEBRA',
'MOS','ZULU'],
'D' : np.random.randn(14)})
grouped = df.groupby(['A', 'B','C'])
grouped.count()
| A | B | C | D |
|-----|-------|--------|---|
| bar | one | MOS | 1 |
| | | ZEBRA | 1 |
| | three | ZEBRA | 1 |
| | two | ALPHA | 1 |
| | | JOOMLA | 1 |
| | | TREE | 1 |
| foo | one | MR | 1 |
| | | ZULU | 1 |
| | three | PLANT | 1 |
| | | ZEBRA | 1 |
| | | ZULU | 1 |
| | two | KID | 1 |
| | | MOS | 1 |
| | | ZEBRA | 1 |
newdf= grouped.count()
newdf.loc[('bar','three','ZEBRA')]
#1
Desired:
| A | B | C | D |
|-----|-------|-------|---|
| bar | one | ZEBRA | 1 |
| bar | three | ZEBRA | 1 |
| foo | three | ZEBRA | 1 |
| foo | two | ZEBRA | 1 |