
I have created a Python list of random tensors using TensorFlow 2.0, which I then want to save as an HDF5 file using the "h5py" module in Python.

The code I am using is as follows-

import tensorflow as tf
import numpy as np
import h5py

# list to hold values-
wts_extracted = []

wts_extracted.append(tf.cast(tf.random.normal(shape = (4, 4), mean = 0, stddev = 1), dtype=tf.float32))
wts_extracted.append(tf.cast(tf.random.normal(shape = (4,), mean = 0, stddev = 1), dtype=tf.float32))
wts_extracted.append(tf.cast(tf.random.normal(shape = (4, 3), mean = 0, stddev = 1), dtype=tf.float32))
wts_extracted.append(tf.cast(tf.random.normal(shape = (3, ), mean = 0, stddev = 1), dtype=tf.float32))
wts_extracted.append(tf.cast(tf.random.normal(shape = (3, 1), mean = 0, stddev = 1), dtype=tf.float32))
wts_extracted.append(tf.cast(tf.random.normal(shape = (1, ), mean = 0, stddev = 1), dtype=tf.float32))

# Convert Python list to numpy array-
n = np.array(wts_extracted)

n.shape
# (6,)

type(n)
# numpy.ndarray

# Save manually created weights as '.h5' file-
with h5py.File("manually_created_weights.h5", "w") as f:
    dset = f.create_dataset("default", data = n)

The error message I get is:

TypeError                                 Traceback (most recent call last)
      1 with h5py.File("manually_created_weights.h5", "w") as f:
----> 2     dset = f.create_dataset("default", data = n)

~/.local/lib/python3.7/site-packages/h5py/_hl/group.py in create_dataset(self, name, shape, dtype, data, **kwds)
    134 
    135         with phil:
--> 136             dsid = dataset.make_new_dset(self, shape, dtype, data, **kwds)
    137             dset = dataset.Dataset(dsid)
    138             if name is not None:

~/.local/lib/python3.7/site-packages/h5py/_hl/dataset.py in make_new_dset(parent, shape, dtype, data, chunks, compression, shuffle, fletcher32, maxshape, compression_opts, fillvalue, scaleoffset, track_times, external, track_order, dcpl)
    116     else:
    117         dtype = numpy.dtype(dtype)
--> 118     tid = h5t.py_create(dtype, logical=1)
    119 
    120     # Legacy

h5py/h5t.pyx in h5py.h5t.py_create()

h5py/h5t.pyx in h5py.h5t.py_create()

h5py/h5t.pyx in h5py.h5t.py_create()

TypeError: Object dtype dtype('O') has no native HDF5 equivalent

I read that in order to use the "h5py" module, you should convert Pythonic objects to numpy arrays, which I did. What's going wrong?
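For completeness, here is a quick diagnostic sketch (re-using the n created above); the expected observations are noted as comments:

# Diagnostic sketch, re-using the converted array 'n' from above-
print(n.dtype)                  # object, i.e. dtype('O'), as reported in the error
print(type(n[0]))               # a TensorFlow EagerTensor, not a plain ndarray
print(n[0].shape, n[1].shape)   # (4, 4) (4,) -- the per-element shapes differ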

Thanks!

EDIT:

"h5py" cannot handle different shapes/dimensions, hence, I have the following code:

# Save the different shape arrays as separate datasets:

# Python dict-
wts_extracted_dict = {}

for l, wts in enumerate(wts_extracted):
    wts_extracted_dict['layer-{0}'.format(l)] = wts


# Create '.h5' file on HDD-
f = h5py.File("manual_weights.h5", "w")

# Create a group-
grp = f.create_group("alist")

for k, v in wts_extracted_dict.items():
    grp.create_dataset(k, data=v)

grp
# <HDF5 group "/alist" (6 members)>

f.close()

However, when I read the saved file back and try to access the group, this is as far as I get:

f = h5py.File("manual_weights.h5", "r")

f.keys()
# <KeysViewHDF5 ['alist']>

f['alist']
# <HDF5 group "/alist" (6 members)>

type(f['alist'])
# h5py._hl.group.Group

How do I access the datasets inside the group?

  • Check the contents and the dtype of the arrays; the Object dtype is used when a NumPy array can only store references to Python objects (an ndarray of lists, or ndarray of strings, etc.). – AMC Jan 16 '20 at 15:51
  • Does this answer your question? [Saving with h5py arrays of different sizes](https://stackoverflow.com/questions/37214482/saving-with-h5py-arrays-of-different-sizes) – AMC Jan 16 '20 at 15:52
  • The list arrays differ in size. `h5py` can't save that kind of array. – hpaulj Jan 16 '20 at 16:27
  • Look at the `keys` of the group just as you did with the file. `groups` are like dictionaries, and may be nested, but ultimately will contain datasets. `groups` should be adequately described in the `h5py` docs. – hpaulj Jan 16 '20 at 17:39
  • The group `['alist']` has 6 members (datasets created in your `for k, v` loop). You get the dataset names the same way you got group names. Reference the group object (instead of the file object), like this: `f['alist'].keys()`. Once you have the dataset names, access them like this: `f['alist']['dataset_name']` or `f['alist/dataset_name']`. – kcw78 Jan 16 '20 at 23:02
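
Following kcw78's comment, here is a minimal read-back sketch (assuming the "manual_weights.h5" file written in the EDIT above; the dataset names 'layer-0' to 'layer-5' come from the dict keys used when saving, and expected output is shown as comments):

import h5py

with h5py.File("manual_weights.h5", "r") as f:
    # List the dataset names stored inside the 'alist' group-
    print(list(f['alist'].keys()))
    # ['layer-0', 'layer-1', 'layer-2', 'layer-3', 'layer-4', 'layer-5']

    # Read one dataset back as a NumPy array-
    w0 = f['alist']['layer-0'][()]    # equivalently: f['alist/layer-0'][()]
    print(w0.shape)                   # (4, 4)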
