Hello, I'm stuck trying to convert a MATLAB matrix to a pandas DataFrame. The conversion runs, but I end up with a single row that contains a list of lists. Those inner lists should normally be my rows.

import pandas as pd
import numpy as np
from scipy.io import loadmat  # public entry point; scipy.io.matlab.mio is deprecated
Data_mat = loadmat('senet50-ferplus-logits.mat')

Data_mat.keys() gives me this output:

dict_keys(['__header__', '__version__', '__globals__', 'images', 'wavLogits'])

I'd like to convert images and wavLogits to a DataFrame. Following this post, I applied its solution:

cardio_df = pd.DataFrame(np.hstack((Data_mat['images'], Data_mat['wavLogits'])))

The output is df, which has the single-row problem described above.

How can I get df into a proper format, with one row per record?

[UPDATE] Data_mat["images"] contains:

array([[(array([[array(['A.J._Buckley/test/Y8hIVOBuels_0000001.wav'], dtype='<U41'),
        array(['A.J._Buckley/test/Y8hIVOBuels_0000002.wav'], dtype='<U41'),
        array(['A.J._Buckley/test/Y8hIVOBuels_0000003.wav'], dtype='<U41'),
        ...,
        array(['Zulay_Henao/train/s4R4hvqrhFw_0000007.wav'], dtype='<U41'),
        array(['Zulay_Henao/train/s4R4hvqrhFw_0000008.wav'], dtype='<U41'),
        array(['Zulay_Henao/train/s4R4hvqrhFw_0000009.wav'], dtype='<U41')]],
      dtype=object), array([[     1,      2,      3, ..., 153484, 153485, 153486]], dtype=int32), array([[   1,    1,    1, ..., 1251, 1251, 1251]], dtype=uint16), array([[array(['Y8hIVOBuels'], dtype='<U11'),
        array(['Y8hIVOBuels'], dtype='<U11'),
        array(['Y8hIVOBuels'], dtype='<U11'), ...,
        array(['s4R4hvqrhFw'], dtype='<U11'),
        array(['s4R4hvqrhFw'], dtype='<U11'),
        array(['s4R4hvqrhFw'], dtype='<U11')]], dtype=object), array([[1, 2, 3, ..., 7, 8, 9]], dtype=uint8), array([[array(['A.J._Buckley/1.6/Y8hIVOBuels/1/01.jpg'], dtype='<U37')],
       [array(['A.J._Buckley/1.6/Y8hIVOBuels/1/02.jpg'], dtype='<U37')],
       [array(['A.J._Buckley/1.6/Y8hIVOBuels/1/03.jpg'], dtype='<U37')],
       ...,
       [array(['Zulay_Henao/1.6/s4R4hvqrhFw/9/16.jpg'], dtype='<U36')],
       [array(['Zulay_Henao/1.6/s4R4hvqrhFw/9/17.jpg'], dtype='<U36')],
       [array(['Zulay_Henao/1.6/s4R4hvqrhFw/9/18.jpg'], dtype='<U36')]],
      dtype=object), array([[1.00000e+00],
       [1.00000e+00],
       [1.00000e+00],
       ...,
       [1.53486e+05],
       [1.53486e+05],
       [1.53486e+05]], dtype=float32), array([[3, 3, 3, ..., 1, 1, 1]], dtype=uint8))]],
      dtype=[('name', 'O'), ('id', 'O'), ('sp', 'O'), ('video', 'O'), ('track', 'O'), ('denseFrames', 'O'), ('denseFramesWavIds', 'O'), ('set', 'O')])
abdoulsn

1 Answer

This is what I'd do to convert a .mat file into a pandas DataFrame automagically.

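import numpy as np
import pandas as pd
import scipy.io
mat = scipy.io.loadmat('file.mat')
mat = {k: v for k, v in mat.items() if not k.startswith('_')}  # drop __header__, __version__, __globals__
df = pd.DataFrame({k: np.array(v).flatten() for k, v in mat.items()})  # one column per variable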
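
One caveat: the snippet above only works when every variable flattens to the same length. In the question's dump, images is a 1x1 struct array whose fields (name, id, sp, ...) each hold their own array, so flattening it yields a single element, which is where the one-row result comes from. Below is a minimal sketch of unpacking such a struct field by field, assuming the layout shown in the question. The helpers unwrap and struct_to_frame are hypothetical names for illustration, not part of SciPy or pandas, and the small synthetic array stands in for the real .mat file.

```python
import numpy as np
import pandas as pd


def unwrap(cell):
    """Reduce 1-element arrays (how loadmat wraps strings and scalars) to the bare value."""
    while isinstance(cell, np.ndarray) and cell.size == 1:
        cell = cell.item()
    return cell


def struct_to_frame(struct, fields):
    """Build a DataFrame from selected fields of a 1x1 MATLAB struct array.

    Assumes all selected fields have the same number of elements.
    """
    record = struct[0, 0]  # the single struct element
    columns = {}
    for field in fields:
        values = record[field].ravel()  # (1, N) or (N, 1) -> (N,)
        columns[field] = [unwrap(v) for v in values]
    return pd.DataFrame(columns)


# Synthetic stand-in mimicking the structure in the question (made-up values).
names = np.empty((1, 2), dtype=object)
names[0, 0] = np.array(["speaker_a/test/clip_0000001.wav"])
names[0, 1] = np.array(["speaker_b/train/clip_0000001.wav"])

struct = np.empty((1, 1), dtype=[("name", "O"), ("id", "O"), ("set", "O")])
struct["name"][0, 0] = names
struct["id"][0, 0] = np.array([[1, 2]], dtype=np.int32)
struct["set"][0, 0] = np.array([[3, 1]], dtype=np.uint8)

df = struct_to_frame(struct, ["name", "id", "set"])
print(df)
```

With the real file, the analogous call would presumably be struct_to_frame(Data_mat['images'], ['name', 'id', 'sp', 'video', 'track', 'set']). The denseFrames and denseFramesWavIds fields appear to have a different length from the others in the dump, so they would need a separate DataFrame.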
Rainb