I did some research and stored the results in an HDF5 file using the h5py module. I opened and read the data many times using both h5py and the HDFView tool from The HDF Group. This all worked fine until one day my computer crashed while the file was open in HDFView.

After rebooting the PC I could no longer open the file. HDFView shows only a generic error: "Error opening file "

I wrote the file with h5py, so I decided to try reading the data with it as well. The file was written in SWMR mode with libver='latest'. I tried the following:

with h5py.File(fpath, 'r', swmr=True, libver='latest') as f:
    pass

This raises "OSError: Unable to open file (file is not already open for SWMR writing)"

with h5py.File(fpath, 'r') as f:
    pass

This raises "OSError: Unable to open file (file is already open for write (may use h5clear file to clear file consistency flags))"

Now I'm wondering, is the h5clear option implemented in the h5py module yet? I cannot find any information about this anywhere.

Edit: Removed the file (sorry)

Alex
  • Regarding h5clear option, ask the developers if they have implemented it. The h5py users forum is at: https://groups.google.com/forum/#!forum/h5py – kcw78 Nov 29 '18 at 14:29
  • Indeed, you may have a corrupted file. Have you tried any of the batch programs to read it (h5dump from HDFGroup, or ptdump delivered with the PyTables module)? – kcw78 Nov 29 '18 at 14:33
  • Thanks for the replies, I made a post on the HDF forums at https://groups.google.com/forum/#!topic/h5py/zcyB2tNQ6Eo – Alex Nov 29 '18 at 14:41
  • Thanks for the suggestions. I'm currently trying to figure out how to use the HDF tool in Visual Studio, since my programming experience is unfortunately limited to Python right now – Alex Nov 29 '18 at 14:43
  • Looks like `h5clear` is a user utility that you run from OS, like `h5dump` and `h5stats`. https://support.hdfgroup.org/HDF5/doc/RM/Tools.html – hpaulj Nov 29 '18 at 17:45
  • Thank you, you are correct, you solved my corrupt file issues! – Alex Nov 29 '18 at 21:33
  • `h5clear` command not found. I have `hdf5 1.10.0-patch1`. Other command line utils like `h5dump` and `h5repack` work fine. Is there any way I can clear or reset the flags using the available utils? (Ubuntu Bionic Beaver btw) – hridayns Jan 29 '19 at 05:40

1 Answer

Given an HDF5 file that throws the error "Unable to open file (file is already open for write/SWMR write)", and which you have no way to recreate, you can clear the file consistency flags using the command line tool h5clear.

$> h5clear -s my_bad.h5
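
If you would rather trigger this from a Python script, the same command can be wrapped with the standard library's subprocess module. This is only a minimal sketch and not part of h5py: the function names build_h5clear_command and clear_status_flags are made up for illustration, and it assumes the h5clear executable can be found on your PATH (or that you pass its full path).

```python
import subprocess
from pathlib import Path


def build_h5clear_command(h5_path, executable="h5clear"):
    """Return the argument list equivalent to `h5clear -s <file>`."""
    # -s asks h5clear to clear the status flags stored in the file's superblock.
    return [str(executable), "-s", str(Path(h5_path))]


def clear_status_flags(h5_path, executable="h5clear"):
    """Run h5clear on the file (hypothetical helper, not an h5py function).

    Raises FileNotFoundError if the executable cannot be found and
    subprocess.CalledProcessError if h5clear exits with a non-zero status.
    """
    cmd = build_h5clear_command(h5_path, executable)
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


# Example call, with the file name used above:
# clear_status_flags("my_bad.h5")
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces, which is common on Windows.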

One way to get the h5clear utility (on Windows 10 or any other OS) is to install h5py (or pandas, I am not sure which was responsible) using the Anaconda Python distribution. On my system, the executable was located in the environment's bin directory: anaconda3/envs/my_env/Library/bin/h5clear. I expect you can also get this utility by installing h5py from pip, though I have not tested this.
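
To check whether the active Python environment ships the executable, something like the following could be used. It is a sketch under assumptions: candidate_dirs and find_h5clear are hypothetical helper names, and the directories searched (Library/bin as on my Windows system, and bin as is usual on Linux and macOS) are guesses about where a conda environment places the tool, not a guarantee.

```python
import shutil
import sys
from pathlib import Path


def candidate_dirs(prefix=sys.prefix):
    """Directories inside an environment where h5clear might live (assumed layout)."""
    prefix = Path(prefix)
    return [prefix / "Library" / "bin", prefix / "bin"]


def find_h5clear(prefix=sys.prefix):
    """Return the path to h5clear as a string, or None if it was not found."""
    # First try the normal PATH lookup.
    found = shutil.which("h5clear")
    if found:
        return found
    # Then look inside the environment's own directories.
    for directory in candidate_dirs(prefix):
        found = shutil.which("h5clear", path=str(directory))
        if found:
            return found
    return None


# print(find_h5clear())  # prints a path, or None if h5clear is not installed
```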

If you have Anaconda installed, you can create an environment, install the packages, and then run h5clear using the following commands from the command line. On Windows I use git-bash, but this should also work from the Anaconda Prompt, or even the Windows Command Prompt if you set up your path correctly.

$> conda create --name demo

$> source activate demo

(demo) $> conda install h5py pandas

(demo) $> h5clear -s my_bad.h5
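
Putting the pieces together, a script could try to open the file normally and only fall back to h5clear when the error message points to stale consistency flags. This is a rough sketch, assuming h5py is installed and h5clear is on the PATH. The names is_stale_flag_error and open_after_clearing are hypothetical, and the message check relies only on the error text quoted in the question, so it may need adjusting for other HDF5 versions.

```python
import subprocess


def is_stale_flag_error(exc):
    """Guess from the message whether an OSError is the 'already open' flag problem."""
    msg = str(exc)
    return "h5clear" in msg or "already open for write" in msg


def open_after_clearing(path):
    """Open an HDF5 file read-only, running `h5clear -s` once if the flags look stale."""
    import h5py  # imported here so the helper above can be used without h5py

    try:
        return h5py.File(path, "r")
    except OSError as exc:
        if not is_stale_flag_error(exc):
            raise  # some other problem, e.g. a missing or truly corrupt file
        # Clear the status flags in place, then try once more.
        subprocess.run(["h5clear", "-s", str(path)], check=True)
        return h5py.File(path, "r")


# with open_after_clearing("my_bad.h5") as f:
#     print(list(f.keys()))
```

Because h5clear modifies the file in place, it may be worth running this against a copy of the file first.
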
Steven C. Howell
  • How did you manage to clear an SWMR flag using pandas or h5py? I spent a long time looking for a solution like this and came to the conclusion it was not possible. I ended up clearing the file from the command line using the hdf5 library (outside of python). – Alex Jan 30 '19 at 08:46
  • From the command line, I activated the Anaconda Python environment that I installed pandas and h5py into. Then I had access to the command line tool, `h5clear`. So from the command line I ran the code I listed in my response. – Steven C. Howell Jan 30 '19 at 18:04
  • My conda prompt seems to just call the HDF5 package (not the one installed in python) when I do that – Alex Jan 31 '19 at 12:31
  • You can always specify the specific `h5clear` executable in the command line, either by typing the relative/absolute path, or go to the folder `h5clear` is in and run it from there. As an example of the first option, in a bash environment (e.g. git-bash on Windows), you could run `/c/Users/myself/AppData/Local/Continuum/anaconda3/envs/my_env/Library/bin/h5clear -s my_bad.h5`. In a Command Prompt, you switch the path to Windows style, but otherwise it should work the same. – Steven C. Howell Feb 07 '19 at 15:02
  • Yeah exactly, but as I mentioned before that's not really a solution within python, you need to download the actual HDF5 utilities to be able to do this. – Alex Feb 07 '19 at 19:03
  • I did not have to download anything other than a Python library, either pandas or h5py. As part of installing that Python library, the executable was added to the bin directory I have in my answer. There was no need to install a separate HDF5 utility. And if you really want, [you can execute command line utilities directly from Python](https://stackoverflow.com/a/89243/3585557). – Steven C. Howell Feb 09 '19 at 00:49