5

Say I have two dataframes,

import pandas as pd
df1 = pd.DataFrame({'col1':[0,2,3,2],'col2':[1,0,0,1]})
df2 = pd.DataFrame({'col12':[0,1,2,1],'col22':[1,1,1,1]})

Now `df1.to_hdf('nameoffile.h5', 'key_to_store','w',table=True)` successfully stores df1, but I also want to store df2 in the same file. If I use the same call for df2, df1 is simply overwritten: when I load the file and check the keys, I only see df2. How can I store both df1 and df2 in the same h5 file as tables?

Grr
atomsmasher
  • By default `to_hdf` appends, so if you remove `'w'` it should append: `df2.to_hdf('nameoffile.h5', 'key_to_store', table=True)`. See the [docs](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_hdf.html#pandas.DataFrame.to_hdf) – EdChum Jul 08 '16 at 14:03
  • great! works like magic, ty! – atomsmasher Jul 08 '16 at 14:06

2 Answers

6

You are using mode 'w', which overwrites the file. The default mode is 'a' (append), so you can just do:

df2.to_hdf('nameoffile.h5', 'key_to_store', table=True, mode='a')

Check the docs: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_hdf.html#pandas.DataFrame.to_hdf
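A minimal sketch of the append-mode approach, written for newer pandas versions where `format='table'` replaces `table=True`. The function name `store_two_frames` and the keys `'df1'` and `'df2'` are made up for illustration. It assumes PyTables is installed, and it gives each frame its own key because, as far as I understand, writing to an existing key with the default `append=False` replaces what is stored under that key.

```python
import os
import tempfile

import pandas as pd


def store_two_frames(path):
    """Write two frames to one HDF5 file under separate keys; return the keys found."""
    df1 = pd.DataFrame({'col1': [0, 2, 3, 2], 'col2': [1, 0, 0, 1]})
    df2 = pd.DataFrame({'col12': [0, 1, 2, 1], 'col22': [1, 1, 1, 1]})

    # mode='w' creates (or truncates) the file, so use it only for the first write.
    df1.to_hdf(path, key='df1', mode='w', format='table')
    # mode='a' opens the existing file and adds a second node next to the first.
    df2.to_hdf(path, key='df2', mode='a', format='table')

    # Reopen read-only and list the stored keys to confirm both frames are present.
    with pd.HDFStore(path, mode='r') as store:
        return sorted(store.keys())
```

Calling `store_two_frames(os.path.join(tempfile.mkdtemp(), 'nameoffile.h5'))` should list both keys, each with a leading slash. Either frame can then be read back with `pd.read_hdf(path, 'df1')` or `pd.read_hdf(path, 'df2')`.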

Saber
EdChum
  • I was wondering if you have any insights to share about this: http://stackoverflow.com/questions/38283279/combining-huge-h5-files-with-multiple-datasets-into-one-with-odo – atomsmasher Jul 09 '16 at 20:06
  • In newer versions: `df2.to_hdf('nameoffile.h5', 'key_to_store', format="table", mode='a')` – felice Dec 05 '22 at 14:54
1

I have used this in the past without issue:

store = pd.HDFStore(path_to_hdf)
store[new_df_name] = df2
store.close()

So in your case you could try:

store = pd.HDFStore(path_to_hdf)
store['df1'] = df1
store['df2'] = df2
store.close()

I used this in a system where a user could store layouts for microtiter plate experiments. The first time they saved a layout, the HDF file was created; subsequent layouts were then added to the same file under their own keys.

N.B. I set `pd.set_option('io.hdf.default_format', 'table')` at the beginning of my program, so frames assigned through `store[...]` are written in table format.
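The same idea can be sketched with a context manager, so the file is closed even if a write fails, and with the table format passed explicitly rather than through a global option. The helper names `save_layouts` and `load_layout` are hypothetical, and PyTables is assumed to be installed.

```python
import pandas as pd


def save_layouts(path, layouts):
    """Store each DataFrame in `layouts` (a dict of name -> frame) under its own key."""
    # HDFStore opens in mode='a' by default: the file is created if it is missing,
    # and keys already in the file are left alone unless they are written again.
    with pd.HDFStore(path) as store:
        for name, frame in layouts.items():
            # Passing format='table' here avoids relying on a global pandas option.
            store.put(name, frame, format='table')


def load_layout(path, name):
    """Read one stored DataFrame back by its key."""
    with pd.HDFStore(path, mode='r') as store:
        return store[name]
```

Calling `save_layouts(path, {'df1': df1})` and later `save_layouts(path, {'df2': df2})` mirrors the workflow described above: the first call creates the file and the second adds another frame alongside it.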

Grr