I would like to select a subset of the data from a HUGE HDF5 file, day by day. A where mask would be perfect, but I can't make it work with a MultiIndex (I need a where with two conditions):
import itertools
import pandas as pd
import numpy as np
a = ('A', 'B')
i = (0, 1, 2)
idx = pd.MultiIndex.from_tuples(list(itertools.product(a, i)),
                                names=('Alpha', 'Int'))
df = pd.DataFrame(np.random.randn(len(idx), 7), index=idx,
                  columns=('I', 'II', 'III', 'IV', 'V', 'VI', 'VII'))
OK, now I put it in an HDF store:
from pandas.io.pytables import HDFStore
store = HDFStore('cancella.h5', 'w')
store.append('df_mask', df)
But if I read the index back, I get
c = store.select_column('df_mask', 'index')
print(c)
this index, which is WRONG (it is a plain row counter, not the MultiIndex):
0 0
1 1
2 2
3 3
4 4
5 5
dtype: int64
So I can't use the where mask. Can you help me?
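
For reference, here is a minimal sketch of the kind of two-condition selection described above. It assumes that, for a table-format store, pandas saves the MultiIndex level names (`Alpha` and `Int` here) as queryable data columns, so they can be used in a `where` string or read with `select_column`. The scratch file name and the variable names (`subset`, `by_coords`, `mask`) are made up for illustration, and PyTables must be installed.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Same toy frame as in the question.
idx = pd.MultiIndex.from_product([("A", "B"), (0, 1, 2)], names=("Alpha", "Int"))
df = pd.DataFrame(np.random.randn(len(idx), 7), index=idx,
                  columns=("I", "II", "III", "IV", "V", "VI", "VII"))

# Hypothetical scratch file, so nothing existing is overwritten.
path = os.path.join(tempfile.mkdtemp(), "sketch.h5")

with pd.HDFStore(path, mode="w") as store:
    store.append("df_mask", df)

    # Assumption: the level names are stored as data columns,
    # so they can be queried directly with two conditions.
    subset = store.select("df_mask", where="Alpha == 'A' & Int >= 1")

    # Alternative: read each level as a column, build a boolean mask
    # in memory, and select the matching row positions.
    alpha = store.select_column("df_mask", "Alpha")
    ints = store.select_column("df_mask", "Int")
    mask = ((alpha == "B") & (ints < 2)).values
    by_coords = store.select("df_mask", where=np.flatnonzero(mask))

print(subset.index.tolist())
print(by_coords.index.tolist())
```

The second variant reads only the level columns into memory, not the whole table, which may matter for a very large file.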