
I'm trying to overwrite a numpy array that's a small part of a pretty complicated h5 file.

I'm extracting an array, changing some values, then want to re-insert the array into the h5 file.

Extracting the nested array works without any problem:

import h5py

f1 = h5py.File(file_name, 'r')
X1 = f1['meas/frame1/data'][()]  # read the whole dataset into a NumPy array (.value was removed in h5py 3.0)
f1.close()

My attempt at writing the array back looks something like this, but it fails:

f1 = h5py.File(file_name,'r+')
dset = f1.create_dataset('meas/frame1/data', data=X1)
f1.close()

As a sanity check, I ran the equivalent in MATLAB with the following code, and it worked without problems.

h5write(file1, '/meas/frame1/data', X1);

Does anyone have any suggestions on how to do this successfully?

user3508433

3 Answers


You want to assign values, not create a dataset:

f1 = h5py.File(file_name, 'r+')     # open the file in read/write mode
data = f1['meas/frame1/data']       # get a handle to the dataset (nothing is read yet)
data[...] = X1                      # assign new values to the dataset in place
f1.close()                          # close the file

To confirm the changes were properly made and saved:

f1 = h5py.File(file_name, 'r')
np.allclose(f1['meas/frame1/data'][()], X1)  # needs "import numpy as np"; [()] replaces the removed .value
#True
f1.close()
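
The steps above can be tried end to end on a throwaway file. This is only a minimal sketch: the temporary file, its 4x3 array of zeros, and the helper name `overwrite_dataset` are assumptions made up for illustration, and it assumes `h5py` and `numpy` are installed. Only the dataset path `meas/frame1/data` comes from the question.

```python
import os
import tempfile

import h5py
import numpy as np


def overwrite_dataset(file_name, path, new_values):
    """Overwrite an existing dataset in place (hypothetical helper; shapes must match)."""
    with h5py.File(file_name, "r+") as f:
        dset = f[path]          # handle to the on-disk dataset, not a copy
        dset[...] = new_values  # write through the handle


# Build a small demo file that mimics the nested layout from the question.
demo_file = os.path.join(tempfile.mkdtemp(), "demo.h5")
with h5py.File(demo_file, "w") as f:
    f.create_dataset("meas/frame1/data", data=np.zeros((4, 3)))

# Extract the array, change some values, and write it back.
with h5py.File(demo_file, "r") as f:
    X1 = f["meas/frame1/data"][()]  # read the whole dataset into memory
X1[1, :] = 7.0
overwrite_dataset(demo_file, "meas/frame1/data", X1)

# Re-open read-only to confirm the change reached the file.
with h5py.File(demo_file, "r") as f:
    round_trip_ok = np.allclose(f["meas/frame1/data"][()], X1)
print(round_trip_ok)
```

Wrapping the write in a function keeps the file open only for as long as the assignment takes.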
askewchan
  • That `data[...] = X1` is critical! Don't make the mistake of doing `data = X1`. – pattivacek Jan 13 '17 at 22:38
  • The second block is only to check whether everything is closed, right? So the first 4 lines do the actual job. – Dusch Jun 29 '17 at 09:18
  • Another tripwire is that you have to overwrite the full individual data point; so if your data is e.g. 3D points you have to update the full 3D point. Changes to one element of the data point (e.g. only the Z coordinate) will not change anything – Oliver Zendel Sep 20 '17 at 15:49
  • @Dusch, you're right that it's just a check and that the first four lines do the job, but it does not close anything; it checks whether the values match. See [`np.allclose`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.allclose.html) – askewchan Sep 20 '17 at 17:25
  • Is appending [...] necessary if we are only updating part of the dataset? e.g. if we were to do data[5:] = X1, where len(data) = len(X1)+5 ? – Bremsstrahlung May 01 '20 at 19:41
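
The points raised in these comments can be checked on a scratch file. The sketch below is an illustration under assumptions (a made-up 8x3 dataset of zeros in a temporary file), not a statement about any particular real file. It compares a slice assignment, a chained index, a single index expression, and a plain rebinding of the name `data`.

```python
import os
import tempfile

import h5py
import numpy as np

# Scratch file with an 8x3 dataset of zeros (assumed layout for illustration).
demo_file = os.path.join(tempfile.mkdtemp(), "pitfalls.h5")
with h5py.File(demo_file, "w") as f:
    f.create_dataset("meas/frame1/data", data=np.zeros((8, 3)))

with h5py.File(demo_file, "r+") as f:
    data = f["meas/frame1/data"]

    # Partial update: a slice on the left-hand side already writes to the
    # file, so no extra [...] is needed here.
    data[5:] = np.ones((3, 3))

    # Chained indexing: data[0] reads row 0 into a temporary NumPy array,
    # so this assignment changes only that temporary copy.
    data[0][2] = 9.0

    # A single index expression writes through to the file.
    data[1, 2] = 9.0

    # Rebinding: this only points the name `data` at a new in-memory array;
    # the dataset on disk is not touched.
    data = np.full((8, 3), -1.0)

with h5py.File(demo_file, "r") as f:
    result = f["meas/frame1/data"][()]
print(result)
```

In this sketch the slice assignment and `data[1, 2] = 9.0` reach the file, while `data[0][2] = 9.0` and `data = ...` do not, which may be what the "full data point" comment ran into.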

askewchan's answer describes the way to do it (you cannot create a dataset under a name that already exists, but you can of course modify the dataset's data). Note, however, that the existing dataset must have the same shape as the data (X1) you are writing to it. If you want to replace the dataset with data of a different shape, you first have to delete it:

del f1['meas/frame1/data']
dset = f1.create_dataset('meas/frame1/data', data=X1)
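
As a self-contained sketch of the delete-then-recreate approach: the temporary file and the array shapes (4x3 replaced by 5x2) are assumptions chosen for illustration. Because a new dataset is created, any attributes attached to the old one are not carried over.

```python
import os
import tempfile

import h5py
import numpy as np

path = "meas/frame1/data"

# Scratch file holding a 4x3 dataset (assumed starting point).
demo_file = os.path.join(tempfile.mkdtemp(), "replace.h5")
with h5py.File(demo_file, "w") as f1:
    f1.create_dataset(path, data=np.zeros((4, 3)))

# New data with a different shape, so in-place assignment is not an option.
X1 = np.arange(10.0).reshape(5, 2)

with h5py.File(demo_file, "r+") as f1:
    if path in f1:
        del f1[path]                  # remove the old link first
    f1.create_dataset(path, data=X1)  # the name is free again

# Confirm the replacement.
with h5py.File(demo_file, "r") as f1:
    new_shape = f1[path].shape
    values_match = np.array_equal(f1[path][()], X1)
print(new_shape, values_match)
```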
Community
weatherfrog

Different scenarios:

  1. Partial changes to a dataset
with h5py.File(file_name, 'r+') as ds:
    ds['meas/frame1/data'][5] = val  # change index 5 to scalar "val"
    ds['meas/frame1/data'][3:7] = vals  # change values of indices 3--6 to "vals"
  2. Change each value of the dataset (identical dataset sizes)
with h5py.File(file_name, 'r+') as ds:
    ds['meas/frame1/data'][...] = X1  # change array values to those of "X1"
  3. Overwrite the dataset with one of a different size
with h5py.File(file_name, 'r+') as ds:
    del ds['meas/frame1/data']  # delete old, differently sized dataset
    ds.create_dataset('meas/frame1/data', data=X1)  # implant new-shaped dataset "X1"

Since the File object is a context manager, a with statement is a convenient way to package your code: the file is closed automatically once the block ends, even if an error occurs inside it. (Open the file read-only with 'r' if you only need to read data; use 'r+' only when you intend to modify it.)
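
A small sketch of that closing behaviour, using a scratch file and a deliberately raised `RuntimeError` (both assumptions for illustration). It relies on an h5py `File` object evaluating to `False` once it has been closed.

```python
import os
import tempfile

import h5py
import numpy as np

# Scratch file with one small dataset (assumed layout for illustration).
demo_file = os.path.join(tempfile.mkdtemp(), "ctx.h5")
with h5py.File(demo_file, "w") as f:
    f.create_dataset("meas/frame1/data", data=np.zeros(4))

# Normal exit: the file is open inside the block and closed after it.
with h5py.File(demo_file, "r") as f:
    open_inside = bool(f)
closed_after = not bool(f)

# Exit through an exception: the file is still closed on the way out.
try:
    with h5py.File(demo_file, "r+") as g:
        g["meas/frame1/data"][...] = 1.0
        raise RuntimeError("simulated failure after the write")
except RuntimeError:
    pass
closed_after_error = not bool(g)

# The write made before the error is in the file.
with h5py.File(demo_file, "r") as f:
    written = f["meas/frame1/data"][()]
print(open_inside, closed_after, closed_after_error, written)
```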

Dharman