I created an HDF5 file, apparently without any problems, under Ubuntu 12.04 (32-bit), using Anaconda as the Python distribution and working in IPython notebooks. The underlying data are all NumPy arrays. For example:

import numpy as np
import h5py

f = h5py.File('myfile.hdf5','w')

group = f.create_group('a_group')

group.create_dataset(name='matrix', data=np.zeros((10, 10)), chunks=True, compression='gzip')

If I try to open this file from a new IPython notebook, though, I get an error message:

f = h5py.File('myfile.hdf5', "r")

---------------------------------------------------------------------------
IOError                                   Traceback (most recent call last)
<ipython-input-4-b64ac5089cd4> in <module>()
----> 1 f = h5py.File(file_name, "r")

/home/sarah/anaconda/lib/python2.7/site-packages/h5py/_hl/files.pyc in __init__(self, name, mode, driver, libver, userblock_size, **kwds)
    220 
    221             fapl = make_fapl(driver, libver, **kwds)
--> 222             fid = make_fid(name, mode, userblock_size, fapl)
    223 
    224         Group.__init__(self, fid)

/home/sarah/anaconda/lib/python2.7/site-packages/h5py/_hl/files.pyc in make_fid(name, mode, userblock_size, fapl, fcpl)
     77 
     78     if mode == 'r':
---> 79         fid = h5f.open(name, h5f.ACC_RDONLY, fapl=fapl)
     80     elif mode == 'r+':
     81         fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)

/home/sarah/anaconda/lib/python2.7/site-packages/h5py/h5f.so in h5py.h5f.open (h5py/h5f.c:1741)()

IOError: Unable to open file (Unable to find a valid file signature)

Can you tell me what that missing file signature is? Did I miss something when I created the file?

Lilith-Elina
  • Did you `f.close()` your *writable* file before trying to open it again? Also, your example code is not executable: the variables `Mfrgroup`, `fgroup_ID`, `pos`, `Msgroup`, `sgroup_ID` and `names` are not defined. – farenorth Oct 28 '14 at 16:47
  • It is often recommended when writing to or from files (using any file constructor, not just h5py) to use the [`with` statement](http://stackoverflow.com/questions/3012488/what-is-the-python-with-statement-designed-for). For example `with h5py.File('myfile.hdf5', 'w') as f:`. This makes it so that you don't have to explicitly close the file. On the other hand, it makes it difficult to debug file I/O interactively. – farenorth Oct 28 '14 at 16:53
  • Also, do you really need to create all of those variables in your dataset to generate this error? Do you get the same error if you create only one of those variables? – farenorth Oct 28 '14 at 17:01
  • You were right, I did forget to `f.close()` the file! I normally use the `with` statement, but this time followed a tutorial and of course forgot that... Do you want to write an answer for that, or should I? Or is there another way to mark this question as solved? – Lilith-Elina Oct 29 '14 at 07:45
  • I edited your Q to make it simpler. You can write an answer if you like, but I would also clarify in the question *exactly* how you got to this error. Your statement 'open this file form a new script' suggests that a new python interpreter is being used. This will probably only generate an error in Windows. If you are using Windows state that in the Q. Otherwise perhaps you are using one interpreter to both write and read the file? Perhaps using something like Python2's `execfile`? If you are using one interpreter you should state that in the Q. – farenorth Oct 29 '14 at 14:40
  • I edited the Q further to clarify that I'm using Ubuntu and different ipython notebooks. – Lilith-Elina Oct 29 '14 at 15:31
  • Great! I don't know much about iPython notebooks, but based on this error I am guessing that they use the same interpreter (kernel/session). [This page](http://nbviewer.ipython.org/github/Zsailer/multidir_ipynb_tutorial/blob/master/Multi-directory%20IPython%20notebook.ipynb) has a discussion of separating the kernel from the notebook, which is probably apropos to your issue; i.e. perhaps you are using iPy notebooks prior to this change? Those types of things could be documented in the answer you provide. – farenorth Oct 29 '14 at 16:15
  • Hm, no, actually each notebook should have its own kernel. Wouldn't that make more sense, anyway? If I try to open a file within the same notebook twice, I get an error about the close status of the file. – Lilith-Elina Nov 04 '14 at 08:42

2 Answers

Since we resolved the issue in the comments on my question, I'm writing the results out here to mark it as solved.

The main problem was that I forgot to close the file after creating it. There are two simple ways to fix this. Either close the file explicitly:

import numpy as np
import h5py

f = h5py.File('myfile.hdf5','w')
group = f.create_group('a_group')
group.create_dataset(name='matrix', data=np.zeros((10, 10)), chunks=True, compression='gzip')
f.close()

or, my favourite because the file is closed automatically:

import numpy as np
import h5py

with h5py.File('myfile.hdf5','w') as f:
    group = f.create_group('a_group')
    group.create_dataset(name='matrix', data=np.zeros((10, 10)), chunks=True, compression='gzip')
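
To check that the file really was closed properly, it can be reopened read-only and the dataset read back. This is only a minimal sketch: it reuses the file, group and dataset names from above, and it writes into a temporary directory (my assumption, so that nothing in the working directory gets overwritten).

```python
import os
import tempfile

import h5py
import numpy as np

# Assumption: a temporary directory is used so no existing file is overwritten.
path = os.path.join(tempfile.mkdtemp(), 'myfile.hdf5')

# Write: the file is flushed and closed when the with-block exits.
with h5py.File(path, 'w') as f:
    group = f.create_group('a_group')
    group.create_dataset(name='matrix', data=np.zeros((10, 10)),
                         chunks=True, compression='gzip')

# Read: reopening works because the writer has already been closed.
with h5py.File(path, 'r') as f:
    matrix = f['a_group/matrix'][...]  # copy the dataset into a NumPy array

print(matrix.shape)
```

If the first block is replaced by a bare `h5py.File(path, 'w')` that is never closed, the second `with` is where the problem from the question would be expected to show up.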
Lilith-Elina
  • For people who came here because they got this error, I have one piece of general advice: "Just check everything once again". In my case, I had not checked out the `.h5` file from LFS, so it contained LFS metadata instead of HDF data. Nice to know about other sources of this problem. – Tomasz Gandor May 02 '18 at 11:32
  • Thanks! It's always good to know what else causes the same error message. – Lilith-Elina May 03 '18 at 06:36
  • I took the first approach, as shown below, but I still get the same error as mentioned in the post. Here is my code: import h5py f = h5py.File(weight_path, mode='r') flattened_layers = model.layers nb_layers = f.attrs['nb_layers'] for k in range(nb_layers): g = f['layer_{}'.format(k)] weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])] if not weights: continue f.close() – Rahul Modi Feb 10 '19 at 06:11
  • Maybe you should post that as a separate question, as your code is very hard to read this way. – Lilith-Elina Feb 11 '19 at 08:41
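
On the original question of what the "file signature" is: an HDF5 file starts with a fixed 8-byte signature (`\x89HDF\r\n\x1a\n`), located at offset 0 or, if the file begins with a user block, at offset 512, 1024, 2048 and so on. A file that was never closed, or one that is really a Git LFS pointer as in the comment above, may not contain it. The following is a rough sketch using only the standard library; the function name and the two demo files are made up for illustration.

```python
import os
import tempfile

# The 8-byte HDF5 format signature: \x89 H D F \r \n \x1a \n
HDF5_SIGNATURE = b'\x89HDF\r\n\x1a\n'


def find_hdf5_signature(path):
    """Return the byte offset of the HDF5 signature, or None if it is absent.

    Checks offset 0, then 512, 1024, 2048, ... (possible user block sizes).
    """
    size = os.path.getsize(path)
    offset = 0
    with open(path, 'rb') as fh:
        while offset + len(HDF5_SIGNATURE) <= size:
            fh.seek(offset)
            if fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return offset
            offset = 512 if offset == 0 else offset * 2
    return None


# Demo on two throwaway files (hypothetical names, created in a temp directory).
tmp_dir = tempfile.mkdtemp()
good = os.path.join(tmp_dir, 'looks_like_hdf5.h5')
bad = os.path.join(tmp_dir, 'lfs_pointer.h5')
with open(good, 'wb') as fh:
    fh.write(HDF5_SIGNATURE + b'\x00' * 100)  # signature followed by padding
with open(bad, 'wb') as fh:
    fh.write(b'version https://git-lfs.github.com/spec/v1\n')  # text, not HDF5

print(find_hdf5_signature(good))
print(find_hdf5_signature(bad))
```

Finding the signature only shows that the file starts like an HDF5 file; it can still be truncated or otherwise damaged. h5py also ships `h5py.is_hdf5(path)`, which performs a similar check through the HDF5 library itself.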

I was working with https://github.com/matterport/Mask_RCNN and ran into a similar error:

Traceback (most recent call last):
  File "detection.py", line 42, in <module>
    model.load_weights(model_path, by_name=True)
  File "/home/michael/Bachelor/important/Cable-detection/Mask_RCNN-2.1/samples/cable/mrcnn/model.py", line 2131, in load_weights
    f = h5py.File(filepath, mode='r')
  File "/home/michael/.local/lib/python3.6/site-packages/h5py/_hl/files.py", line 271, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
  File "/home/michael/.local/lib/python3.6/site-packages/h5py/_hl/files.py", line 101, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/tmp/pip-s_7obrrg-build/h5py/_objects.c:2840)
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/tmp/pip-s_7obrrg-build/h5py/_objects.c:2798)
  File "h5py/h5f.pyx", line 78, in h5py.h5f.open (/tmp/pip-s_7obrrg-build/h5py/h5f.c:2117)
OSError: Unable to open file (Addr overflow, addr = 800, size=8336, eoa=2144)
HDF5: infinite loop closing library
      D,T,F,FD,P,FD,P,FD,P,E,E,SL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL

The solution above worked for me too. I had interrupted the training at a random point, so the .hdf5 file was not closed properly and could not be opened afterwards.
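
One possible way to guard against this, when you control the code that saves the file, is to write to a temporary file first and only move it into place once writing has finished. This is just a sketch of that idea and not something Mask_RCNN or h5py does for you: `write_atomically`, `save_ok`, `save_interrupted` and the file name are hypothetical, and plain bytes stand in for the real weights.

```python
import os
import tempfile


def write_atomically(final_path, write_fn):
    """Call write_fn(tmp_path), then move the finished file to final_path."""
    directory = os.path.dirname(os.path.abspath(final_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.tmp')
    os.close(fd)  # write_fn reopens the path itself
    try:
        write_fn(tmp_path)
        os.replace(tmp_path, final_path)  # atomic rename on the same filesystem
    except BaseException:
        # Also covers KeyboardInterrupt: discard the partial file and re-raise.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def save_ok(path):
    with open(path, 'wb') as fh:
        fh.write(b'complete')


def save_interrupted(path):
    with open(path, 'wb') as fh:
        fh.write(b'partial')
        raise KeyboardInterrupt  # simulate Ctrl-C in the middle of a save


target = os.path.join(tempfile.mkdtemp(), 'weights.h5')
write_atomically(target, save_ok)
try:
    write_atomically(target, save_interrupted)
except KeyboardInterrupt:
    pass

with open(target, 'rb') as fh:
    content = fh.read()
print(content)  # the earlier complete file is still in place
```

With h5py, `write_fn` would be a function that opens the given path in a `with h5py.File(path, 'w') as f:` block and writes the datasets there.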

DerOzean