
My question is similar to "How to check if a column exists in Pandas", but for the multi-index column case.

I'm trying to process values in a DataFrame with multi-index columns, using column names that come from another file, hence the need to check whether a column exists. A representative example is below:

import pandas as pd
from numpy.random import randint, randn

df = pd.DataFrame({
    'A': [randint(0, 3) for p in range(0, 12)],
    'B': [0.1 * randint(0, 3) for p in range(0, 12)],
    'C': [0.1 * randint(0, 3) for p in range(0, 12)],
    'D': randn(12),
})

df1 = df.groupby(['A','B','C']).D.sum().unstack(-1)
df1 = df1.T
df1
A           0                   1                             2          
B         0.0       0.2       0.0       0.1       0.2       0.0       0.1
C                                                                        
0.0       NaN       NaN       NaN  0.845316       NaN  0.555513       NaN
0.1       NaN  0.139371       NaN       NaN       NaN       NaN -0.260868
0.2  5.002509       NaN  0.637353  0.438863  0.943098       NaN       NaN

df1[1][0.1]
C
0.0    0.845316
0.1         NaN
0.2    0.438863

Accessing df1[0][0.1] in the above example raises a KeyError. How do I check whether a multi-index column exists, so that non-existent columns can be skipped during processing?

Thanks!

kip6000

1 Answer


You can think of a MultiIndex as an array of tuples, so you can access a column like this:

df1[(0, 0.1)]

and test whether it exists like this:

(0, 0.1) in df1.columns
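
To skip non-existent columns during processing, the same membership test can guard each lookup. The following is only a minimal sketch: the small frame is a hypothetical stand-in for df1 (the question's data is random), and it assumes the column names read from the other file arrive as (A, B) tuples. The names wanted, processed, and skipped are illustrative and not from the question.

```python
import pandas as pd

# Hypothetical stand-in for df1: two-level columns (A, B), with the
# (0, 0.1) combination deliberately absent, as in the question.
columns = pd.MultiIndex.from_tuples(
    [(0, 0.0), (0, 0.2), (1, 0.0), (1, 0.1)], names=['A', 'B']
)
df1 = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]], columns=columns
)

# Column keys as they might arrive from another file (assumed format).
wanted = [(0, 0.1), (1, 0.1), (2, 0.0)]

processed = {}
skipped = []
for key in wanted:
    if key in df1.columns:
        # The full tuple exists as a column, so the lookup is safe.
        processed[key] = df1[key].sum()
    else:
        # Missing column: record it and move on instead of raising KeyError.
        skipped.append(key)

print(processed)
print(skipped)
```

Testing the full tuple avoids the chained df1[0][0.1] lookup, which fails on the second step when the first level exists but the second does not.
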
Mr.F