I am having trouble removing specific columns from a MultiIndex DataFrame.

Suppose we have this df:

df.columns
MultiIndex([('Status', 'Group A', 'PASS'),
            ('Status', 'Group A', 'Not PASS'),
            ('Status', 'Group A', 'Absent'),
            ('Status', 'Group B', 'PASS'),
            ('Status', 'Group B', 'Not PASS'),
            ('Status', 'Group B', 'Absent'),
            ('Status', 'Group B', nan),
            ('Status', 'Group C', 'PASS'),
            ('Status', 'Group C', 'Not PASS'),
            ('Status', 'Group C', 'Absent')],
           names=[None, 'Group', 'Status'])

We want to create a new DataFrame from df that keeps only the columns whose last level is 'Not PASS' or 'Absent'.

Is there a way to slice out or remove the unnecessary columns?

AMR
  • Does this answer your question? [How to filter Pandas dataframe using 'in' and 'not in' like in SQL](https://stackoverflow.com/questions/19960077/how-to-filter-pandas-dataframe-using-in-and-not-in-like-in-sql) – assaf.b Apr 20 '20 at 11:55

1 Answer

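Use pd.IndexSlice to select by level. The first : in .loc selects all rows; inside idx, the first and second : select all values of the first and second column levels, and the list filters the third level:

idx = pd.IndexSlice
# labels must match the column values exactly ('Not PASS', not 'Not Pass')
df = df.loc[:, idx[:, :, ['Not PASS', 'Absent']]]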

Alternatively, build a boolean mask from the third level:

df = df.loc[:, df.columns.get_level_values(2).isin(['Not PASS', 'Absent'])]

If you want to remove those columns instead, invert the mask with ~:

df = df.loc[:, ~df.columns.get_level_values(2).isin(['Not PASS', 'Absent'])]
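A minimal, self-contained sketch of the approaches above. The sample frame is an assumption made for illustration: it uses only two groups, dummy integer data, and leaves out the nan column from the question. Note that the labels are case-sensitive, so they must be spelled as they appear in df.columns ('Not PASS').

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a reduced version of the columns in the question
columns = pd.MultiIndex.from_product(
    [["Status"], ["Group A", "Group B"], ["PASS", "Not PASS", "Absent"]],
    names=[None, "Group", "Status"],
)
df = pd.DataFrame(np.arange(12).reshape(2, 6), columns=columns)

wanted = ["Not PASS", "Absent"]

# 1) Slice by level: all rows, all values of levels 0 and 1, listed labels on level 2
idx = pd.IndexSlice
by_slice = df.loc[:, idx[:, :, wanted]]

# 2) Boolean mask built from the third column level
mask = df.columns.get_level_values(2).isin(wanted)
by_mask = df.loc[:, mask]

# 3) Inverted mask: drop the listed labels and keep everything else
dropped = df.loc[:, ~mask]

print(by_mask.columns.tolist())
print(dropped.columns.tolist())
```

Both selections assign to new names here so that df itself stays unchanged and the three results can be compared side by side.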
jezrael