I'm trying to filter null values out of a selection, and was wondering how to write the where clause so that it returns the right rows. For instance, let's say I have the following code:
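import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'foo', 'bar', 'bar', 'baz'],
                   'B': [1, 2, 1, 2, np.nan],
                   'C': np.random.randn(5)})
# data_columns=True makes every column queryable via `where`
df.to_hdf('test.h5', key='df', mode='w', format='table', data_columns=True)
pd.read_hdf('test.h5', 'df')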
     A    B         C
0  foo    1 -0.046065
1  foo    2 -0.987685
2  bar    1 -0.110967
3  bar    2 -1.989150
4  baz  NaN  0.126864
I essentially want the equivalent of saying:
pd.read_hdf('test.h5', 'df', where='B is not null')
How can I go about doing that?
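To be concrete about the result I'm after, here is a minimal sketch that produces it in memory. It skips the HDF round trip entirely and filters after the frame is built, so it is only a stand-in for the where clause I'm asking about, not a solution to it. The variable name `wanted` is just for illustration.

```python
import numpy as np
import pandas as pd

# Same frame as in the question; column C is random, so its values will differ.
df = pd.DataFrame({'A': ['foo', 'foo', 'bar', 'bar', 'baz'],
                   'B': [1, 2, 1, 2, np.nan],
                   'C': np.random.randn(5)})

# Keep only the rows where B is not null.
# This filters in memory, after loading, rather than at read time.
wanted = df[df['B'].notnull()]
print(wanted)
```

This drops row 4 (`baz`), which is the only row with a null `B`. What I'd like is to get the same rows back directly from `read_hdf` without loading the whole table first.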
Thanks!