I have a very large .mat file (~1.3 GB) that I am trying to load in my Python code (an IPython notebook). I tried:
import scipy.io as sio
very_large = sio.loadmat('very_large.mat')
My laptop, with 8 GB of RAM, hangs. I kept the system monitor open and saw the memory consumption climb steadily to 7 GB, and then the system freezes.
What am I doing wrong? Any suggestions or workarounds?
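A partial workaround I am considering (based on the scipy docs) is the variable_names argument to loadmat, which should let me pull in just the small y vector without touching X, though I don't know whether it actually lowers the peak memory when reading the file:

import scipy.io as sio

# Load only the label vector 'y'; variable_names is a documented
# option of scipy.io.loadmat, so X is never materialized.
labels_only = sio.loadmat('very_large.mat', variable_names=['y'])
print(labels_only['y'].shape)

Even if this works, I still need the image array X eventually.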
EDIT:
More details on the data: here is the link: http://ufldl.stanford.edu/housenumbers/
The particular file I am interested in is extra_32x32.mat. From the description: loading the .mat files creates 2 variables: X, a 4-D matrix containing the images, and y, a vector of class labels. To access the images, X(:,:,:,i) gives the i-th 32-by-32 RGB image, with class label y(i).
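To make sure I am reading that description correctly in numpy terms, this is how I would index the data after loadmat (a sketch using the smaller test file; the variable names are mine):

import scipy.io as sio

data = sio.loadmat('test_32x32.mat')   # the smaller file, see below
X, y = data['X'], data['y']

i = 0
image_i = X[:, :, :, i]   # i-th 32x32 RGB image, shape (32, 32, 3)
label_i = y[i, 0]         # its class label; y loads with shape (N, 1)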
So, for example, loading a smaller .mat file from the same page (test_32x32.mat) in the following way:
from __future__ import print_function  # the output below is from Python 2
import numpy as np

SVHN_full_test_data = sio.loadmat('test_32x32.mat')
print("\nData set = SVHN_full_test_data")
for key, value in SVHN_full_test_data.iteritems():
    print("Type of", key, ":", type(value))
    if isinstance(value, np.ndarray):
        print("Shape of", key, ":", value.shape)
    else:
        print("Content:", value)
produces:
Data set = SVHN_full_test_data
Type of y : <type 'numpy.ndarray'>
Shape of y : (26032, 1)
Type of X : <type 'numpy.ndarray'>
Shape of X : (32, 32, 3, 26032)
Type of __version__ : <type 'str'>
Content: 1.0
Type of __header__ : <type 'str'>
Content: MATLAB 5.0 MAT-file, Platform: GLNXA64, Created on: Mon Dec 5 21:18:15 2011
Type of __globals__ : <type 'list'>
Content: []
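As a back-of-the-envelope check based on the shapes above (assuming the images are stored as uint8, one byte per value; the 531,131 image count for the extra set is from the SVHN page):

import numpy as np

# Raw array sizes implied by the printed shapes, assuming uint8 storage.
test_bytes = np.prod((32, 32, 3, 26032))    # ~80 MB for the test set
extra_bytes = 32 * 32 * 3 * 531131          # ~1.6 GB for the extra set
print(test_bytes / 1e6, "MB,", extra_bytes / 1e9, "GB")

If that arithmetic is right, the raw extra array should be only ~1.6 GB, so I don't see why loading it drives memory usage up to 7 GB.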