kitsune_breeze, I reviewed the Q&A and the comments. There are several areas that need to be clarified. Let's start with external links versus object or region references.
As I understand it, you want to create a dataset (i.e., an array) of external links, with each link referencing a different HDF5 file.
The answer from Mahsa Hassankashi on 19-April describes how to create a dataset of dtype=h5py.ref_dtype or dtype=h5py.regionref_dtype. The first is an object reference, and the second is a region reference. They are not the same as external links! Also, the example code requires h5py 2.10.0, and you are using h5py 2.9.0. (FYI, there is a solution to this in 2.9.0 if you choose to use object or region references.)
Here's the bad news: based on my tests, you can't create a dataset (or np array) of HDF5 external links. Here are the steps to see why:
In [1]: import h5py
In [2]: h5fw = h5py.File('SO_61290760.h5',mode='w')
# create an external link object
In [3]: link_obj = h5py.ExternalLink('file1.h5','/')
In [4]: type(link_obj)
Out[4]: h5py._hl.group.ExternalLink
In [5]: link_dtype = type(link_obj)
In [6]: h5fw.create_dataset("MyRefs", (10,), dtype=link_dtype)
Traceback (most recent call last):
...
TypeError: Object dtype dtype('O') has no native HDF5 equivalent
Reading the h5py documentation, it appears object and region references are also dtype('O') datatypes, and require additional metadata to implement them. There is no mention that this was done for external links. As a result, I don't think you can create an array of external links (because there isn't a dtype to support them).
That said, you can still create external links from one HDF5 file to multiple HDF5 files. I have a simple example in my answer to "How can I combine multiple .h5 file?" (look under Method 1: Create External Links).
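For a quick picture of what that looks like, here is a minimal sketch that stores one external link per target file as a member of the root group, rather than as an element of a dataset. The file names, the "data" dataset, the link_N names, and the temporary directory are all made up for illustration. It also relies on HDF5 resolving a relative link file name against the directory of the file that holds the link.

```python
import os
import tempfile

import h5py
import numpy as np

# Scratch directory so the sketch does not touch real files (illustrative only).
workdir = tempfile.mkdtemp()

# Hypothetical target files, each holding one small dataset named "data".
target_names = ["file1.h5", "file2.h5", "file3.h5"]
for i, name in enumerate(target_names):
    with h5py.File(os.path.join(workdir, name), mode="w") as h5f:
        h5f.create_dataset("data", data=np.full(5, i))

# One external link per target file, stored as a group member
# instead of as an element of a dataset.
master_path = os.path.join(workdir, "master.h5")
with h5py.File(master_path, mode="w") as h5fw:
    for i, name in enumerate(target_names):
        h5fw[f"link_{i}"] = h5py.ExternalLink(name, "/")

# Following a link opens the target file transparently.
with h5py.File(master_path, mode="r") as h5fr:
    linked_values = [int(h5fr[f"link_{i}/data"][0]) for i in range(len(target_names))]

print(linked_values)
```

You can loop over the link names in the master file much as you would loop over an array of links, which may be close enough for your use case.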
If you decide to use Object or Region References, you need to use a different dtype specification in h5py 2.9.0.
Object Reference:
2.10.0 use: h5py.ref_dtype
2.9.0 use: h5py.special_dtype(ref=h5py.Reference)
Region Reference:
2.10.0 use: h5py.regionref_dtype
2.9.0 use: h5py.special_dtype(ref=h5py.RegionReference)
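If the same script has to run under both versions, one option is to pick the dtype at run time. This is only a sketch: the helper names object_ref_dtype and region_ref_dtype are mine, not part of h5py, and the check is a plain hasattr() test for the newer attributes.

```python
import h5py


def object_ref_dtype():
    """Return an object-reference dtype for the installed h5py (hypothetical helper)."""
    if hasattr(h5py, "ref_dtype"):  # attribute available from h5py 2.10.0
        return h5py.ref_dtype
    return h5py.special_dtype(ref=h5py.Reference)  # spelling that works in 2.9.0


def region_ref_dtype():
    """Return a region-reference dtype for the installed h5py (hypothetical helper)."""
    if hasattr(h5py, "regionref_dtype"):  # attribute available from h5py 2.10.0
        return h5py.regionref_dtype
    return h5py.special_dtype(ref=h5py.RegionReference)  # spelling that works in 2.9.0


obj_dt = object_ref_dtype()
reg_dt = region_ref_dtype()
```

Either helper returns a NumPy dtype that can be passed straight to create_dataset(..., dtype=...).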
Code below demonstrates the behavior in 2.9.0:
In [9]: type(h5py.ref_dtype)
Traceback (most recent call last):
...
AttributeError: module 'h5py' has no attribute 'ref_dtype'
In [10]: type(h5py.special_dtype(ref=h5py.Reference))
Out[10]: numpy.dtype
In [11]: type(h5py.regionref_dtype)
Traceback (most recent call last):
...
AttributeError: module 'h5py' has no attribute 'regionref_dtype'
In [12]: type(h5py.special_dtype(ref=h5py.RegionReference))
Out[12]: numpy.dtype
In [13]: dset = h5fw.create_dataset("MyRefs", (10,), dtype=h5py.special_dtype(ref=h5py.Reference))
In [14]: dset.dtype
Out[14]: dtype('O')
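To round this out, here is a small sketch of how a dataset like that could be filled and read back, using the 2.9.0-compatible dtype. The file name refs_demo.h5 and the data_N dataset names are invented for the example. Keep in mind that an object reference points at an object in the same file, so this is not a substitute for external links.

```python
import os
import tempfile

import h5py
import numpy as np

# dtype spelling that works in h5py 2.9.0 (and later versions)
ref_dtype = h5py.special_dtype(ref=h5py.Reference)

# Scratch file for the demo (illustrative name).
demo_path = os.path.join(tempfile.mkdtemp(), "refs_demo.h5")

with h5py.File(demo_path, mode="w") as h5fw:
    # Hypothetical datasets to point at, with lengths 1, 2 and 3.
    for i in range(3):
        h5fw.create_dataset(f"data_{i}", data=np.arange(i + 1))

    # Dataset of object references, one per target dataset.
    refs = h5fw.create_dataset("MyRefs", (3,), dtype=ref_dtype)
    for i in range(3):
        refs[i] = h5fw[f"data_{i}"].ref

    # Indexing the file with a reference returns the referenced object.
    lengths = [h5fw[refs[i]].shape[0] for i in range(3)]

print(lengths)
```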