Thanks for the question, and sorry it took me a while to get round to answering it.
This was impossible with the version of rhdf5 available at the time, and it also requires a slightly different approach. The H5T_NATIVE_HBOOL datatype is just a mapping to an unsigned 8-bit int (at least on Linux).
To create the enum datatype you're looking for, you have to create a custom datatype using H5Tenum_create(), and then set the mapping (e.g. TRUE = 1) using H5Tenum_insert().
Here's an example. You'll need rhdf5 version 2.43.1 or newer, which you can get from https://github.com/grimbough/rhdf5
library(rhdf5)
## our input data. Note we're using 1 & 0
## but TRUE/FALSE would also work in this example
dat <- c(1, 1, 0, 1)
## create an HDF5 file
file <- tempfile(fileext = ".h5")
h5file = H5Fcreate(file)
## create the dataspace for our new data
## dat is a plain vector, so dim(dat) would be NULL; use length() instead
h5space = H5Screate_simple(dims = length(dat), NULL, native = TRUE)
## create the enum datatype with our mapping
## TRUE = 1 FALSE = 0
tid <- H5Tenum_create(dtype_id = "H5T_NATIVE_UCHAR")
H5Tenum_insert(tid, name = "TRUE", value = 1L)
H5Tenum_insert(tid, name = "FALSE", value = 0L)
## create the dataset with this new datatype
h5dataset1 = H5Dcreate(h5file, "dataset1", tid, h5space)
## write the data. We have to use as.raw() because our
## base type is 8-bit and R integers are 32-bit
H5Dwrite(h5dataset1, as.raw(dat), h5type = tid)
## tidy up
h5closeAll()
We can use the h5ls command line tool to check that our datatype is an 8-bit enum and that we have the (0=FALSE, 1=TRUE) mapping.
system2("h5ls", args = c("-v", file))
#> Opened "/tmp/Rtmp4zU9m5/file36f657af24dcb.h5" with sec2 driver.
#> dataset1 Dataset {4/4}
#> Location: 1:800
#> Links: 1
#> Storage: 4 logical bytes, 4 allocated bytes, 100.00% utilization
#> Type: enum native unsigned char {
#> TRUE = 1
#> FALSE = 0
#> }
We can also read it back into R.
## we can read it back in and get a factor
h5read(file, name = "/dataset1")
#> [1] TRUE TRUE FALSE TRUE
#> Levels: TRUE FALSE
I don't love this, because you aren't getting back exactly what you wrote: the values come back as a factor with TRUE/FALSE labels rather than as the original numbers.
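If what you want in R is a logical vector, one option is to convert the factor after reading it. This is only a minimal sketch in base R: x is a hand-built stand-in for the value returned by h5read() above (so the snippet runs without the HDF5 file), and res is just an illustrative name.

```r
## stand-in for the factor returned by h5read(); the level order
## follows the enum insertion order used above (TRUE, then FALSE)
x <- factor(c("TRUE", "TRUE", "FALSE", "TRUE"), levels = c("TRUE", "FALSE"))

## compare against the label, not the underlying integer code
res <- as.character(x) == "TRUE"
res
#> [1]  TRUE  TRUE FALSE  TRUE
```

Going via the labels matters here, because the factor's internal codes are 1 and 2 (the level positions), not the 1 and 0 stored in the file.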