
I have been using the h5py module to read a large MATLAB file. However, I am having two issues with my data: accessing specific values from the cell structure, and the alphabetical ordering that the module seems to have applied.

I was using the .value attribute to access specific values, which was working fine. However, it has now been deprecated, and I am now having issues accessing specific values. To work around the ordering issue, I have tried passing track_order=True and also setting it globally; however, this did not solve the problem.

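import h5py
import numpy as np

# Ask h5py to track creation order globally
h5py.get_config().track_order = True

# Open the MATLAB (HDF5-based) file read-only
myfile = h5py.File('C:/largetempdata.mat', 'r', track_order=True)
ref1 = myfile['Day']
ref2 = ref1['Experiment number']
ref3 = ref2['Time']
ref3['midday_data'][()]  # read the whole dataset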

At the moment, ref3['midday_data'][()] outputs

array([[<HDF5 object reference>]], dtype=object)

as opposed to the actual value, which I need to use further down in the code.
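For context, an array of `<HDF5 object reference>` entries means the dataset stores pointers to other datasets rather than the values themselves. Below is a minimal sketch of following such references, assuming every reference is non-null and points at a dataset. The file name `demo_refs.h5`, the path `#refs#/a`, and the helper `deref_cell` are made up for illustration; the in-memory file only imitates a MATLAB-style cell and is not the real .mat file.

```python
import h5py
import numpy as np


def deref_cell(h5file, dataset):
    """Follow every object reference in `dataset` and return the target arrays.

    Hypothetical helper: assumes each entry is a valid (non-null) reference
    to a dataset in the same file.
    """
    refs = dataset[()]  # object array of h5py.Reference
    return [h5file[ref][()] for ref in refs.flat]


# Build a tiny in-memory file (nothing is written to disk) that stores
# a 1x1 array of references, similar in shape to the output in the question.
with h5py.File("demo_refs.h5", "w", driver="core", backing_store=False) as f:
    target = f.create_dataset("#refs#/a", data=np.array([[1.5, 2.5, 3.5]]))
    cell = f.create_dataset("midday_data", shape=(1, 1), dtype=h5py.ref_dtype)
    cell[0, 0] = target.ref  # store a pointer to the real data

    values = deref_cell(f, f["midday_data"])
    first = values[0]  # NumPy array read from the referenced dataset
```

With the real file, the equivalent call would presumably be `deref_cell(myfile, ref3['midday_data'])`, since a reference can be resolved through the file object or any group in it.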

In terms of the ordering, labels are being put into alphabetical order. For example, I am getting:

['A_group_data', 'C_group_data', 'Miscellaneous', 'Z_group_data']

instead of the actual ordering present in the .mat file:

['A_group_data', 'C_group_data', 'Z_group_data', 'Miscellaneous']

Any help or advice with these issues would be greatly appreciated.

  • The use of `value` might be discouraged, but as far as I know still works. I thought `[:]` was the suggested replacement, though `[...]` or `[()]` might be needed to access 0d arrays. But is that actually the issue here? – hpaulj Sep 04 '19 at 16:10
  • http://docs.h5py.org/en/stable/refs.html - for info on object references. – hpaulj Sep 04 '19 at 16:10
  • What labels are you talking about? – hpaulj Sep 04 '19 at 16:12
  • Questioner, you are almost there. An object reference points to another dataset object that has the values. Take a look at this explanation for the SVHN data set in Matlab format: (https://stackoverflow.com/a/55643382/10462884). It deconstructs object references and shows how to use slicing notation to get one object instead of using `.value`. I can expand on that if you share more details about `ref3['midday_data'][()]`. Use **HDFView** from **The HDF Group** to open your .mat file to 'see' how object references work. – kcw78 Sep 04 '19 at 16:14
  • @hpaulj, the matlab HDF5 format with object references is complicated (especially for a new user). First you have to get the object from an array ( `obj = ref3['midday_data'][0][0]` in this example) , then use it to reference the data that's actually in another array (`ds = ref3[obj][:]` or similar). Speaking from experience, it's not so easy the first time. :-) – kcw78 Sep 04 '19 at 16:36
  • Since I only have Octave I can't generate test files to play with this. I also got the sense that MATLAB has gone through a couple of iterations in its HDF5 format. – hpaulj Sep 04 '19 at 16:52
  • Regarding test files -- same here. My only test files are the SVHN data. The "hard part" about Matlab are the object references -- dataset objects pointing to other datasets. Not sure why they designed it this way. It's sure not easy, logical, or intuitive (and more demonic than Pythonic). _It's easy once you master object references._ (he said) – kcw78 Sep 04 '19 at 17:52
  • Apologies for the delay, and thank you so much for your comments. At the moment, where previously `.value` would return my values, I am getting the following error: AttributeError: 'numpy.int32' object has no attribute 'encode'. – Questioner Sep 06 '19 at 09:41
  • In terms of the labels: when I list the labels from one of the levels of the cell structure, i.e. `ref3 = ref2['Time']` then `Times = list(ref3)`, the resulting labels are in alphabetical order despite me globally tracking the order at the start. An order field is created but is empty. – Questioner Sep 06 '19 at 09:54
  • Was just working through the comments one-by-one, thank you for the references kcw78, I will take a look now :) – Questioner Sep 06 '19 at 09:55
  • Just a quick update: the link you provided has solved my issues with referencing specific values, thank you :) The only problem that now remains is the reordering of the labels. – Questioner Sep 06 '19 at 10:35
  • Glad my previous answer helped. I can't help with reordering labels. It's not something I've had to do. Why is order important for your application? Maybe there's another way to accomplish the same result? – kcw78 Sep 06 '19 at 13:20
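On the ordering question, a possible explanation (an assumption, not confirmed anywhere in this thread) is that `track_order` only takes effect when a group is created. If the file was written without creation-order tracking, as may be the case for files saved by MATLAB, h5py has no creation order to read back and iterates by name instead. The sketch below shows the setting working on freshly created in-memory files; the file names `demo_order.h5` and `demo_plain.h5` are made up, and the group names are borrowed from the question.

```python
import h5py

names = ["Z_group_data", "A_group_data", "Miscellaneous", "C_group_data"]

# Root group created WITH creation-order tracking: iteration follows
# the order in which the groups were added.
with h5py.File("demo_order.h5", "w", driver="core",
               backing_store=False, track_order=True) as f:
    for name in names:
        f.create_group(name)
    tracked = list(f)

# Root group created WITHOUT tracking: no creation order is recorded,
# so h5py iterates by name instead.
with h5py.File("demo_plain.h5", "w", driver="core",
               backing_store=False) as f:
    for name in names:
        f.create_group(name)
    untracked = list(f)

print(tracked)
print(untracked)
```

If this is indeed the cause, reopening an existing file with `track_order=True` would not recover the original order, and the order would have to be stored some other way, for example as an explicit list of names saved alongside the data.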

0 Answers