12

I am trying to read an HDF5 (.h5) file in Python.

The file can be found at this link and is called 'vstoxx_data_31032014.h5'. The code I am trying to run is from the book Python for Finance, by Yves Hilpisch, and goes like this:

import pandas as pd
h5 = pd.HDFStore('path.../vstoxx_data_31032014.h5', 'r')
futures_data = h5['futures_data']  # VSTOXX futures data
options_data = h5['options_data']  # VSTOXX call option data
h5.close()

I am getting the following error:

h5 = pd.HDFStore('path.../vstoxx_data_31032014.h5', 'r')
Traceback (most recent call last):

  File "<ipython-input-692-dc4e79ec8f8b>", line 1, in <module>
    h5 = pd.HDFStore('path.../vstoxx_data_31032014.h5', 'r')

  File "C:\Users\Laura\Anaconda3\lib\site-packages\pandas\io\pytables.py", line 466, in __init__
    self.open(mode=mode, **kwargs)

  File "C:\Users\Laura\Anaconda3\lib\site-packages\pandas\io\pytables.py", line 637, in open
    raise IOError(str(e))

OSError: HDF5 error back trace

  File "C:\aroot\work\hdf5-1.8.15-patch1\src\H5F.c", line 604, in H5Fopen
    unable to open file
  File "C:\aroot\work\hdf5-1.8.15-patch1\src\H5Fint.c", line 1085, in H5F_open
    unable to read superblock
  File "C:\aroot\work\hdf5-1.8.15-patch1\src\H5Fsuper.c", line 277, in H5F_super_read
    file signature not found

End of HDF5 error back trace

Unable to open/create file 'path.../vstoxx_data_31032014.h5'

where I have substituted my working directory for 'path.../' for the purpose of this question.

Does anyone know where this error might be coming from?

Framester
python_enthusiast

2 Answers

15

To open an HDF5 file with the h5py module, you can use h5py.File(filename). The documentation can be found here.

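import h5py

filename = "vstoxx_data_31032014.h5"

# Open read-only; the with block closes the file automatically on exit
with h5py.File(filename, 'r') as h5:
    # These are h5py objects tied to the open file, not pandas DataFrames,
    # so work with them inside the with block
    futures_data = h5['futures_data']  # VSTOXX futures data
    options_data = h5['options_data']  # VSTOXX call option data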
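
As a side note on the traceback in the question: "file signature not found" is the HDF5 library reporting that it could not find the HDF5 format signature in the file. One possible cause (an assumption, since the download link is not available here) is an incomplete download or an HTML page saved under the .h5 name. A minimal sketch of a check using only the standard library is below; the helper names has_hdf5_signature and first_bytes are hypothetical and are not part of pandas or h5py.

```python
from pathlib import Path

# The 8-byte signature that the HDF5 format places at the start of the superblock
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def has_hdf5_signature(path):
    """Return True if the HDF5 signature is found at offset 0, 512, 1024, ..."""
    size = Path(path).stat().st_size
    with open(path, "rb") as fh:
        offset = 0
        while offset + len(HDF5_SIGNATURE) <= size:
            fh.seek(offset)
            if fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return True
            # The format allows a user block before the superblock, so the
            # signature may sit at 512 bytes or any successive doubling.
            offset = 512 if offset == 0 else offset * 2
    return False


def first_bytes(path, n=64):
    """Return the first n raw bytes, e.g. to spot an HTML page saved as .h5."""
    with open(path, "rb") as fh:
        return fh.read(n)
```

Calling has_hdf5_signature('path.../vstoxx_data_31032014.h5') with the real path should return False for a file that triggers this error, and first_bytes on the same path may hint at what the file actually contains. If the check fails, downloading the file again is probably the first thing to try.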
DavidG
-3
import os

import h5py as h5

os.chdir('path of your working directory')  # change to your working directory
wd = os.getcwd()  # query the current working directory
print(wd)

if __name__ == '__main__':
    # Open the file read-only and print the names of its top-level entries
    with h5.File("hdf5 file with its path", "r") as f:
        for n in f.keys():
            print(n)
Benjamin Breton
    Please don't post code-only answers. Future readers will be grateful to see an explanation of *why* this answers the question instead of having to infer it from the code. Also, since this is an old question, please explain how it complements the other, accepted answer. Finally, please find out how code formatting works. – Gert Arnold May 28 '22 at 12:28