I want to read and write data to an HDF5 file incrementally because the data does not fit into memory.
The data consists of sets of integers. I only need to read and write the sets sequentially (set1, then set2, then set3, and so on); I don't need random access.
The problem is that I can't retrieve a set by its index. Here is a minimal example:
import pandas as pd
x = pd.HDFStore('test.hf', 'w', append=True)
a = pd.Series([1])
x.append('dframe', a, index=True)
b = pd.Series([10,2])
x.append('dframe', b, index=True)
x.close()
x = pd.HDFStore('test.hf', 'r')
print(x['dframe'])
y = x.select('dframe', start=0, stop=1)
print("selected:", y)
x.close()
Output:
0 1
0 10
1 2
dtype: int64
selected: 0 1
dtype: int64
The selection returns only the first row. It does not return my 0th set, {1, 10}, i.e. all the rows whose index is 0.
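
For reference, here is a minimal sketch of the difference I am running into, assuming PyTables is installed and that `start`/`stop` count row positions in the stored table rather than index labels. The temporary file path and the names `first_row` and `set0` are made up for illustration. A `where` clause on the index appears to be what selects by label:

```python
import os
import tempfile

import pandas as pd

# Hypothetical scratch file so the sketch does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "test.h5")

# Append two Series; each keeps its own 0-based index, so label 0 occurs twice.
with pd.HDFStore(path, mode="w") as store:
    store.append("dframe", pd.Series([1]))
    store.append("dframe", pd.Series([10, 2]))

with pd.HDFStore(path, mode="r") as store:
    # start/stop are row positions, so this yields only the first stored row.
    first_row = store.select("dframe", start=0, stop=1)
    # A where clause on the index matches every row whose label is 0.
    set0 = store.select("dframe", where="index == 0")

print(first_row.tolist())
print(set0.tolist())
```

If this is right, reading the sets one after another would mean repeating the `where` query with index 1, 2, and so on, so only one set is loaded into memory at a time.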