I'm trying to save an array as an HDF5 file using R, but having no luck.

To try and diagnose the problem I ran example(hdf5save). This successfully created an HDF5 file that I could read easily with h5dump.

When I then ran the R code manually, I found that it didn't work. The code I ran was exactly the same as is run in the example script (except for a change of filename to avoid overwriting). Here is the code:

(m <- cbind(A = 1, diag(4)))
ll <- list(a=1:10, b=letters[1:8]);
l2 <- list(C="c", l=ll); PP <- pi
hdf5save("ex2.hdf", "m","PP","ll","l2")
rm(m,PP,ll,l2)  # and reload them:
hdf5load("ex2.hdf",verbosity=3)
m        # read from "ex2.hdf"; buglet: dimnames dropped
str(ll)
str(l2)

and here is the error message from h5dump:

h5dump error: unable to open file "ex2.hdf"

Does anyone have any ideas? I'm completely at a loss.

Thanks

matt

2 Answers

I have had this problem. I am not sure of the cause and neither are the hdf5 maintainers. The authors of the R package have not replied.

Alternatives that work

In the time since I originally answered, the hdf5 package has been archived, and suitable alternatives (h5r, rhdf5, and ncdf4) have been created; I am currently using `ncdf4`:

  1. Since netCDF-4 uses hdf5 as a storage layer, the ncdf4 package provides an interface to both netCDF-4 and hdf5.
  2. The h5r package, which requires R >= 2.10.
  3. The rhdf5 package, which is available on Bioconductor.
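
As a rough illustration of the third alternative, here is a minimal sketch of a write/read round trip with rhdf5. It assumes rhdf5 is installed from Bioconductor (the block is skipped otherwise); the temporary file path and the dataset name "m" are hypothetical choices made for this example, not part of the original answer.

```r
# Sketch only: assumes the Bioconductor package rhdf5 is installed.
path <- tempfile(fileext = ".h5")   # hypothetical output path
m <- cbind(A = 1, diag(4))          # same matrix as in the question

if (requireNamespace("rhdf5", quietly = TRUE)) {
  rhdf5::h5createFile(path)            # create an empty HDF5 file
  rhdf5::h5write(m, path, "m")         # store the matrix as dataset "m"
  print(rhdf5::h5ls(path))             # list what the file contains
  m_back <- rhdf5::h5read(path, "m")   # read the dataset back into R
  print(dim(m_back))
}
```

If this works on your system, running h5dump on the resulting file is a quick way to confirm that tools outside R can read it too.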

Workarounds

Two functional but unsatisfactory workarounds that I used prior to finding the alternatives above:

  1. Install R 2.7, hdf5 version 1.6.6, R hdf5 v1.6.7, and zlib1g version 1:1.2.3.3 and use this when writing the files (this was my solution until migrating to the ncdf4 library).
  2. Use h5totxt at the command line, from the [h5utils](http://ab-initio.mit.edu/wiki/index.php/H5utils) tools (requires using bash and rewriting your R code).

A minimal, reproducible example that raises the error:

First R session

library(hdf5)
dat <- 1:10
hdf5save("test.h5","dat")
q(save = "no")  # quit without saving the workspace

Second R session:

library(hdf5)
hdf5load("test.h5")

output:

HDF5-DIAG: Error detected in HDF5 library version: 1.6.10 thread
47794540500448.  Back trace follows.
 #000: H5F.c line 2072 in H5Fopen(): unable to open file
   major(04): File interface
   minor(17): Unable to open file
 #001: H5F.c line 1852 in H5F_open(): unable to read superblock
   major(04): File interface
   minor(24): Read failed
 #002: H5Fsuper.c line 114 in H5F_read_superblock(): unable to find file
signature
   major(04): File interface
   minor(19): Not an HDF5 file
 #003: H5F.c line 1304 in H5F_locate_signature(): unable to find a valid
file signature
   major(05): Low-level I/O layer
   minor(29): Unable to initialize object
Error in hdf5load("test.h5") : unable to open HDF file: test.h5
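
The back trace above ends with "unable to find a valid file signature". One way to look into that outside of any HDF5 package is to inspect the first bytes of the file directly. The sketch below uses only base R and the fixed 8-byte HDF5 format signature; the function name `has_hdf5_signature` is a hypothetical helper introduced here. It only looks at offset 0, whereas the format also allows the signature at offsets 512, 1024, and so on when a user block is present, so a FALSE result is a hint rather than proof of corruption.

```r
# HDF5 format signature: the 8 bytes \x89 H D F \r \n \x1a \n
hdf5_signature <- as.raw(c(0x89, 0x48, 0x44, 0x46, 0x0d, 0x0a, 0x1a, 0x0a))

# Hypothetical helper: TRUE if the file starts with the HDF5 signature.
# Only checks offset 0, so files with a user block will report FALSE.
has_hdf5_signature <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  header <- readBin(con, what = "raw", n = length(hdf5_signature))
  identical(header, hdf5_signature)
}
```

Calling `has_hdf5_signature("test.h5")` after the first R session would show whether the file on disk begins with the signature at all.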
David LeBauer
  • Thank you for your solutions, I think your second option fits best for me. I couldn't find the hdf5tools program, but I did find [h5utils](http://ab-initio.mit.edu/wiki/index.php/H5utils) which contains a program called [h5fromtxt](http://ab-initio.mit.edu/h5utils/h5fromtxt-man.html) that seems to do what I want. – matt Feb 10 '11 at 09:30
  • @Matt I have updated my answer, and I meant to say h5utils, a very nice set of tools. I found that `h5totxt test.h5` worked on files for which `hdf5load('test.h5')` did not (so the written file is only unreadable by R, not necessarily corrupt). – David LeBauer Feb 10 '11 at 17:41
  • It appears that the `hdf5save` command only works occasionally, and there doesn't seem to be any rhyme or reason to it. At least (thanks to you) we now have a reproducible code sample that I/we can take to their bug tracker (or try to fix it, I'm assuming it's Open Source). I'm still at work now, but I'll do it when I get home. Cheers, – matt Feb 10 '11 at 18:31
  • I believe this is related to how on.exit changed post R 2.8. The hdf5cleanup() function is not being called anymore because of it, which is likely leaving the file in a bad state. – frankc Jul 24 '12 at 17:49
  • @frank is there an easy fix? Is there a reason h5totxt still works? – David LeBauer Jul 24 '12 at 18:23
  • @my fix was to use h5r instead of hdf5...i couldn't find a way to fix it without having to change code and it wasn't worth it for me – frankc Jul 24 '12 at 20:02
I've also run into the same issue and found a reasonable fix.

The issue seems to stem from how the hdf5 library finalizes the file. If it doesn't get a chance to finalize the file, the file is left corrupted. I think finalization happens after the buffer is flushed, but the buffer doesn't always flush.

One solution I've found is to do the hdf5save in a separate function. Assign the variables into the globalenv(), then call hdf5save and exit the function. When the function completes, the memory seems to be cleaned up, which makes the hdf5 library flush the buffer and finalize the file.

Hope this helps!

mindmatters