How to multiply large matrices without getting a memory error

Asked Feb 18 '17 at 21:26

Active Feb 18 '17 at 21:45

Viewed 498 times

I have a matrix X with dimensions (140000, 28). Calculating `np.transpose(X) * X` in Python gives a memory error. How can I do the multiplication?

Name: X, Type: Float64, Size: (140000L, 28L)... I think it is an ndarray.

edited Feb 18 '17 at 21:45

asked Feb 18 '17 at 21:26

Maroun Sassine

How much memory do you have? – user2357112 Feb 18 '17 at 21:30
Is `X` a `np.matrix` or a `np.ndarray`? Can you show us actual code that produces the MemoryError? This shouldn't be a strenuous operation. – user2357112 Feb 18 '17 at 21:32
16 GB on a good day – Maroun Sassine Feb 18 '17 at 21:33
Any chance the matrix is sparse (mostly zeros)? If so, you can use SciPy's sparse matrix support (`scipy.sparse`), which might help a lot. – Oliver Dain Feb 18 '17 at 21:41
Is X (140000,28) or (140000,1)? If the latter, is it a matrix or an array? I suspect the latter, in which case you would be asking for a (140000,140000) array, which is rather large. Try `@` instead of `*`, or `.dot` on older Python – Paul Panzer Feb 18 '17 at 21:44
@PaulPanzer Sorry, it is (140000,28). So X.T * X dim is (28,28) – Maroun Sassine Feb 18 '17 at 21:48
In that case it can't be ndarray because your multiplication would raise an exception. Please be careful in reporting these details because they do make a difference. If you type `type(X)` what do you get? – Paul Panzer Feb 18 '17 at 21:53
@PaulPanzer `` – Maroun Sassine Feb 18 '17 at 22:02
OK, this is actually impossible, so please do me the favour and type the following two lines: `X.shape`, and if that does indeed return (140000,28), then please type your `X.T*X` again. It should raise an exception complaining 'operands could not be broadcast together ...' – Paul Panzer Feb 18 '17 at 22:07
@user2357112 should I work with np.matrix or np.array in that case ? – Maroun Sassine Feb 18 '17 at 22:08
@PaulPanzer Yes, you are right, I made a mistake in the comment above. The code line is `np.transpose(X) * X`, as written in the question. – Maroun Sassine Feb 18 '17 at 22:11
Possible duplicate of [Very large matrices using Python and NumPy](http://stackoverflow.com/questions/1053928/very-large-matrices-using-python-and-numpy) – Peter Wood Feb 18 '17 at 22:14
That shouldn't make a difference. If your `X` is ndarray of shape (140000, 28) then `transpose(X) * X` will raise the same exception about incompatible shapes. – Paul Panzer Feb 18 '17 at 22:14
@PeterWood I don't think so, the matrices here aren't that large, really. – Paul Panzer Feb 18 '17 at 22:17
@MarounSassine: Never `np.matrix`. Always use arrays. – user2357112 Feb 18 '17 at 22:47
@user2357112 Seconded. MarounSassine, did you try `transpose(X) * X` again now that `X` is confirmed to be an ndarray of shape (140000, 28)? I'm not aware of any numpy version where that wouldn't result in a shape mismatch. Please try it and report back. Also, your numpy and python versions would be useful. – Paul Panzer Feb 18 '17 at 22:51
@PaulPanzer I did this: `X = scipy.sparse.coo_matrix(X);` `Sb_t = X.T * X;` for (140000,1) it worked, now I will try for (140000,28) – Maroun Sassine Feb 18 '17 at 23:42
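
For reference, the behaviour discussed in the comments can be sketched as below, assuming `X` is a plain `np.ndarray` of dtype float64. The array is a smaller random stand-in rather than the asker's data, and the names `gram`, `col` and `outer` are made up for illustration. On an ndarray, `*` is elementwise, so `X.T * X` with shape (n, 28) fails to broadcast, while `@` (or `.dot`) gives the (28, 28) product. With shape (n, 1), `*` silently broadcasts to (n, n), which for n = 140000 would need roughly 157 GB as float64 and is a plausible source of the memory error.

```python
import numpy as np

# Smaller stand-in for the (140000, 28) float64 array in the question.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 28))

# Elementwise `*` on ndarrays: shapes (28, 1000) and (1000, 28) cannot be
# broadcast together, so NumPy raises ValueError instead of multiplying.
try:
    X.T * X
    elementwise_error = None
except ValueError as exc:
    elementwise_error = str(exc)

# Matrix product: `@` needs Python 3.5+, `.dot` also works on older versions.
gram = X.T @ X          # shape (28, 28)
gram_dot = X.T.dot(X)   # same result

# Column-vector case: (1, n) * (n, 1) broadcasts to a full (n, n) array.
# This is cheap for n = 1000 but far too large for n = 140000.
col = X[:, :1]          # shape (1000, 1)
outer = col.T * col     # shape (1000, 1000)
inner = col.T @ col     # shape (1, 1), what was probably intended

print(elementwise_error)
print(gram.shape, outer.shape, inner.shape)
```

The same `X.T @ X` call applies unchanged to the full (140000, 28) array: the input occupies about 31 MB and the result is only 28 x 28.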

0 Answers