
I have a large sparse matrix (~5 billion non-zero values) in Python, stored in SciPy's csc_matrix format, and I need to open it as a sparse matrix in MATLAB. savemat apparently cannot save data of this size (it seems to be capped at ~5 GB), so I am saving it as an HDF5 file instead, as detailed here. However, I am having trouble opening it in MATLAB.

I have three vectors, data, indices, and indptr, whose meaning is explained as follows:

standard CSC representation where the row indices for column i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]].
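
To make the layout concrete, here is a minimal pure-Python sketch of the slicing rule quoted above. The 3x3 example matrix and the helper name `column_entries` are made up for illustration and are not part of the original question.

```python
# Hypothetical 3x3 example, not the matrix from the question:
# [[1, 0, 2],
#  [0, 0, 3],
#  [4, 0, 5]]
data = [1, 4, 2, 3, 5]      # non-zero values, stored column by column
indices = [0, 2, 0, 1, 2]   # 0-based row index of each value
indptr = [0, 2, 2, 5]       # column i occupies positions indptr[i]:indptr[i+1]


def column_entries(i):
    """Return (row, value) pairs for column i, using 0-based indices."""
    start, stop = indptr[i], indptr[i + 1]
    return list(zip(indices[start:stop], data[start:stop]))


for col in range(len(indptr) - 1):
    print(col, column_entries(col))
```

Note that the middle column is empty, which shows up as two equal consecutive entries in `indptr`.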

How can I construct this matrix in MATLAB? I can read these three vectors in MATLAB with h5read without any problem, but I don't know how to build the sparse matrix from them. The sparse command I usually use does not take its input in this format.

The_Anomaly
  • As a stopgap, I'd try the transfer using the `coo` format: data, rows, cols (adjusted for the 0/1 index start). It won't be as compact, but it is probably more compatible. Compared to `scipy`, `MATLAB` seems to hide a lot of the sparse format details. – hpaulj Mar 25 '17 at 21:27
  • @hpaulj that is a huge help, thank you. – The_Anomaly Mar 25 '17 at 21:47
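
The conversion suggested in the first comment could be sketched roughly as below. This is only an illustration under stated assumptions: it uses plain Python lists for readability, the helper name `csc_to_one_based_coo` is hypothetical, and for a matrix with billions of entries one would work with arrays instead (for example, SciPy's own `tocoo()` method exposes `row`, `col`, and `data`, to which 1 can be added for the index shift).

```python
def csc_to_one_based_coo(data, indices, indptr):
    """Expand CSC arrays into (rows, cols, vals) triplets with 1-based indices."""
    rows, cols, vals = [], [], []
    for col in range(len(indptr) - 1):
        # Positions indptr[col]:indptr[col + 1] belong to this column.
        for pos in range(indptr[col], indptr[col + 1]):
            rows.append(indices[pos] + 1)   # shift 0-based row index to 1-based
            cols.append(col + 1)            # shift 0-based column index to 1-based
            vals.append(data[pos])
    return rows, cols, vals


# Hypothetical 3x3 example with an empty middle column.
rows, cols, vals = csc_to_one_based_coo(
    data=[1, 4, 2, 3, 5], indices=[0, 2, 0, 1, 2], indptr=[0, 2, 2, 5]
)
print(rows)  # [1, 3, 1, 2, 3]
print(cols)  # [1, 1, 3, 3, 3]
print(vals)  # [1, 4, 2, 3, 5]
```

Triplets in this form match the `sparse(i, j, v, m, n)` calling form on the MATLAB side, at the cost of storing one column index per non-zero instead of one pointer per column.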

1 Answer


The following code works, but is very slow. Any suggestions would be appreciated.

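% data, indices, indptr: CSC vectors read with h5read; shape = [nrows, ncols]
X = spalloc(double(shape(1)), double(shape(2)), numel(data));  % sparse matrix with room for all non-zeros
for k = 1:length(indptr)-1
    i = double(indptr(k))+1 : double(indptr(k+1));  % 1-based positions of column k's entries
    y = double(indices(i)) + 1;                     % 0-based row indices -> 1-based
    X(y,k) = double(data(i));                       % fill column k
end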