
(Warning - I am a newbie)

I imported mat files using scipy.io:

   import scipy.io as spio

   data = spio.loadmat('data.mat', struct_as_record=True, squeeze_me=True)
   data = data['b']
   print(type(data[0]))
   >>> <type 'numpy.void'>

This gave me an array whose elements are of type numpy.void. Each record has 17 fields, which hold strings, floats, or arrays.

print(data.shape)
>>> (11000,)

I have another list of strings which I converted to a numpy.array:

filenames = np.array([filenames])
filenames = np.ndarray.flatten(filenames)
print (filenames.shape)
>>> (11000,)

print(data.dtype)
print(filenames.dtype)

>>> [('fieldname1', 'O'), ('fieldname2', 'O'), ('fieldname3', 'O'), ('fieldname4', 'O'), ('fieldname5', 'O'), ('fieldname6', 'O'), ('fieldname7', 'O'), ('fieldname8', 'O'), ('fieldname9', 'O'), ('fieldname10', 'O'), ('fieldname11', 'O'), ('fieldname12', 'O'), ('fieldname13', 'O'), ('fieldname14', 'O'), ('fieldname15', 'O'), ('fieldname16', 'O'), ('fieldname17', 'O')]
>>> |S16

I want to concatenate these along a column:

NEW = np.concatenate((data, filenames), axis=1)

But I am getting this error:

>>> TypeError: invalid type promotion

Any help would be very much appreciated.

ashley
  • This question is similar: https://stackoverflow.com/questions/31172991/typeerror-invalid-type-promotion-when-appending-to-a-heterogeneous-numpy-arra but I still get the same error. – ashley Oct 17 '17 at 14:14
  • `data.dtype` is more useful to us than `type`. `concatenate` does not work to add a field to a structured array. You need something like `recfunctions.append_fields` (I'll look up details later). – hpaulj Oct 17 '17 at 14:30
  • Thank you @hpaulj I have updated my question. – ashley Oct 17 '17 at 14:56
  • `append_fields`, https://stackoverflow.com/q/5288736 – hpaulj Oct 17 '17 at 15:26
  • The 'O' dtypes mean the fields are objects. Check the contents. Sometimes loads from matlab can be complex. It has to translate foreign things like cells and structs. – hpaulj Oct 17 '17 at 15:30
  • Thank you @hpaulj! That worked. Want to move it to an answer so that I can accept it? It didn't seem to have any trouble with the objects, I think because they are not nested further than one level. I did find this, just in case: https://stackoverflow.com/questions/7008608/scipy-io-loadmat-nested-structures-i-e-dictionaries – ashley Oct 17 '17 at 19:12

1 Answer


recfunctions is a module with tools for manipulating structured arrays (and their variant, recarray). It requires a separate import. In my experience it is also somewhat buggy.

In [158]: from numpy.lib import recfunctions

Make an array with several object dtype fields:

In [159]: dat = np.empty((3,),dtype=('O,O,O'))
In [160]: dat
Out[160]: 
array([(None, None, None), (None, None, None), (None, None, None)],
      dtype=[('f0', 'O'), ('f1', 'O'), ('f2', 'O')])

After a bit of trial and error in calling append_fields, this works:

In [161]: names = np.array(['one','two','three'])
In [162]: dat1 = recfunctions.append_fields(dat, 'names', names, usemask=False)
In [163]: dat1
Out[163]: 
array([(None, None, None, 'one'), (None, None, None, 'two'),
       (None, None, None, 'three')],
      dtype=[('f0', 'O'), ('f1', 'O'), ('f2', 'O'), ('names', '<U5')])
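
If recfunctions misbehaves, the same result can probably be built by hand: define a new dtype that extends the old one, allocate an empty array of that dtype, and copy the fields across. This is only a minimal sketch of that idea, not what append_fields does internally. The `dat` and `names` arrays are the same stand-ins as above, and it assumes a simple dtype without padding or nested fields.

```python
import numpy as np

# Stand-ins for the loaded structured array and the extra column.
dat = np.empty((3,), dtype='O,O,O')
names = np.array(['one', 'two', 'three'])

# New dtype = the existing fields plus one extra field for the names.
new_dtype = np.dtype(dat.dtype.descr + [('names', names.dtype.str)])

# Allocate the result and copy the data over one field at a time.
out = np.empty(dat.shape, dtype=new_dtype)
for field in dat.dtype.names:
    out[field] = dat[field]
out['names'] = names

print(out.dtype.names)
```

Copying field by field avoids the type promotion that `concatenate` attempts, since each assignment only involves one field's dtype.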

But check the contents of the data that's loaded from MATLAB. The .mat may contain structs and cells, which loadmat has to translate into numpy equivalents. To do so it makes extensive use of object dtype arrays.
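
One way to check the contents is to take a single record and look at the Python type stored in each object field. The sketch below uses a small made-up array in place of the real `data['b']`; the field names and values are hypothetical, chosen only to mimic the mix of strings and arrays described in the question.

```python
import numpy as np

# Hypothetical stand-in for data['b']: two records with object fields.
data = np.empty((2,), dtype=[('fieldname1', 'O'), ('fieldname2', 'O')])
data['fieldname1'] = ['a.wav', 'b.wav']
data['fieldname2'][0] = np.arange(3)
data['fieldname2'][1] = 1.5

# Look at the first record and report what each field actually holds.
record = data[0]
summary = {name: type(record[name]).__name__ for name in data.dtype.names}
for name, kind in summary.items():
    print(name, kind)
```

If a field reports `ndarray`, it is worth checking that array's own dtype as well, since a MATLAB cell or struct may show up as another object or structured array.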

hpaulj