Large matrix multiplication in Python - what is the best option?

Question

I have two boolean sparse square matrices of c. 80,000 x 80,000 generated from 12MB of data (and am likely to have matrices that are orders of magnitude larger when I use GBs of data).

I want to multiply them. The product should be a triangular matrix; however, I don't get one, since I don't restrict the dot product to yield a triangular matrix.

I am wondering what the best way of multiplying them is (memory-wise and speed-wise). I am going to do the computation on an m2.4xlarge AWS instance, which has >60GB of RAM. I would prefer to keep the calculation in RAM for speed reasons.

I appreciate that SciPy has sparse matrices and so does h5py, but have no experience in either.

What's the best option to go for?

Thanks in advance

UPDATE: sparsity of the boolean matrices is <0.6%

do you multiply them as booleans, i.e result is of boolean type? and how sparse is your data, what % of ones? — alko, Dec 04 '13 at 20:21
Yep, I multiply them as 0/1 values, hence the resulting matrix contains 0 or integers greater than 0. How do I check the sparseness of my matrices? — user7289, Dec 04 '13 at 20:28
You generated them, so you should know from the algorithm. You can also count the ones with `sum()` and divide by the total size (6.4*10**9 in your case) — alko, Dec 04 '13 at 20:46
I'm not sure, but I think the result won't be sparse in general and you won't be able to store it in a few dozen GB of RAM. At any rate, check the csr and csc sparse formats in scipy.sparse. — jorgeca, Dec 05 '13 at 06:57
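
As a rough illustration of the density check and the CSR format mentioned in the comments above, here is a minimal sketch. The 1,000 x 1,000 random matrix is only a stand-in for the real data, and the variable names are made up for the example.

```python
import numpy as np
from scipy import sparse

# Small random stand-in for one of the real 80,000 x 80,000 matrices.
rng = np.random.default_rng(0)
dense = rng.random((1000, 1000)) < 0.006   # boolean array, ~0.6% True
m = sparse.csr_matrix(dense)               # compressed sparse row storage

# Density = stored nonzeros / total number of cells.
density = m.nnz / (m.shape[0] * m.shape[1])
print(f"{m.nnz} nonzeros, density {density:.4%}")
```

For a matrix that is already sparse, `m.nnz` avoids building the dense array at all, which is the point at 80,000 x 80,000.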

Roland Smith · Answer 1 · 2013-12-11T00:00:35.907

1

If your matrices are relatively empty it might be worthwhile encoding them as a data structure of the non-False values. Say a list of tuples describing the location of the non-False values. Or a dictionary with the tuples as the keys.

If you use e.g. a list of tuples you could use a list comprehension to find the items in the second list that can be multiplied with an element from the first list.

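a = [(0, 0), (3, 7), (5, 2)]  # (row, col) of each non-False value, et cetera
b = [(0, 1), (7, 4), (2, 2)]  # idem (example values)

res = set()
for r, c in a:
    # a[r, c] pairs with b[j, k] only when c == j; the product is then non-False at (r, k)
    res.update((r, k) for j, k in b if c == j)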

edited Dec 11 '13 at 00:00

answered Dec 07 '13 at 15:00

Roland Smith


See above additional answer showing this is very time consuming if the data set is large. I'm guessing that if it's on the order of several hundred datapoints or fewer, methinks this probably is reasonably quick. – Kevin J. Rice Dec 30 '13 at 18:08

score -1 · Answer 2 · edited May 23 '17 at 12:06

-- EDITED TO SATISFY BELOW COMMENT / DOWNVOTER --

You're asking how to multiply matrices fast and easy.

SOLUTION 1: This is a solved problem: use numpy. All these operations are easy in numpy, and since they are implemented in C, are rather blazingly fast.

SciPy has sparse matrices and sparse matrix multiplication (in scipy.sparse); NumPy itself only provides dense arrays. A sparse matrix doesn't use much memory because only the nonzero entries are stored (scipy.sparse's CSR and CSC formats keep them in a few flat arrays rather than linked lists), so it will only use the memory required for the sum of the datapoints, plus some overhead. And it will almost certainly be blazingly fast compared to a pure Python solution.
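
A minimal sketch of what that looks like with scipy.sparse, assuming small random 0/1 matrices as stand-ins for the real data (the size `n` and the variable names are illustrative only):

```python
import numpy as np
from scipy import sparse

n = 500  # illustrative size; the question's matrices are 80,000 x 80,000
rng = np.random.default_rng(0)

# Random 0/1 matrices at roughly 0.6% density, stored as CSR.
a = sparse.csr_matrix((rng.random((n, n)) < 0.006).astype(np.int32))
b = sparse.csr_matrix((rng.random((n, n)) < 0.006).astype(np.int32))

c = a @ b                 # sparse product; each entry counts matching pairs
c_bool = c.astype(bool)   # boolean result: True wherever the count is nonzero

for name, m in (("a", a), ("b", b), ("a @ b", c)):
    print(f"{name}: {m.nnz} nonzeros, density {m.nnz / n**2:.3%}")
```

The printed densities are worth watching: the product is generally denser than either input, so its memory footprint, not that of the inputs, is what tends to decide whether the calculation fits in RAM.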

SOLUTION 2

Another answer here suggests storing the values as (x, y) tuples, presuming a value is False unless its tuple exists, in which case it is True. An alternative for a numeric matrix is (x, y, value) tuples.

REGARDLESS: Multiplying these would be nasty time-wise: find element one, decide which other array element to multiply it by, then search the entire dataset for that specific tuple, and if it exists, multiply and insert the result into the result matrix.

SOLUTION 3 ( PREFERRED vs. Solution 2, IMHO )

I would prefer this because it's simpler / faster.

Represent each sparse matrix as a dictionary. Each entry maps the position (x, y) of an element to its value v, for the positions (x1, y1), (x2, y2), etc.:

matrixDictOne = { 'x1:y1' : v1, 'x2:y2': v2, ... }
matrixDictTwo = { 'x1:y1' : v1, 'x2:y2': v2, ... }

Since a Python dict lookup is O(1) on average (dicts are hash tables; only the worst case, with many hash collisions, degrades towards O(n)), it's fast. This does not require searching the entire second matrix's data for element presence before multiplication. It's easy to write the multiply and easy to understand the representations.
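
A minimal sketch of a dict-based multiply, with two deviations from the layout above that are assumptions of this example: the keys are `(row, col)` tuples instead of `'x:y'` strings, and the second matrix is regrouped by row first. A flat dict answers "is there an entry at exactly (k, j)?" in constant time, but multiplication needs "all entries in row k", which the nested form gives directly. The helper names `to_rows` and `sparse_matmul` are made up for illustration.

```python
from collections import defaultdict

def to_rows(entries):
    """Group {(row, col): value} into {row: {col: value}}."""
    rows = defaultdict(dict)
    for (r, c), v in entries.items():
        rows[r][c] = v
    return rows

def sparse_matmul(a, b):
    """Multiply two sparse matrices stored as {(row, col): value} dicts."""
    b_rows = to_rows(b)
    result = defaultdict(int)
    for (i, k), a_val in a.items():
        # Only row k of b can pair with a[i, k].
        for j, b_val in b_rows.get(k, {}).items():
            result[(i, j)] += a_val * b_val
    return dict(result)

a = {(0, 0): 1, (3, 7): 1, (5, 2): 1}
b = {(0, 1): 1, (7, 4): 1, (7, 5): 1}
product = sparse_matmul(a, b)
print(product)
```

This does work proportional to the number of matching pairs rather than to the full matrix size, but it is still pure Python, so a scipy.sparse product is likely to be much faster on matrices of the size in the question.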

SOLUTION 4 (if you are a glutton for punishment)

Code this solution by using a memory-mapped file of the required size. Initialize a file with null values of the required size. Compute the offsets yourself and write to the appropriate locations in the file as you do the multiplication. Linux has a VMM which will page in and out for you with little overhead or work on your part. This is a solution for very, very large matrices that are NOT SPARSE and thus won't fit in memory.
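
A minimal sketch of the memory-mapped approach, swapping in `numpy.memmap` so the offsets are computed by NumPy instead of by hand. The sizes, the block width and the file name `product.dat` are assumptions chosen to keep the example small; the inputs here are dense on purpose, since this solution targets the non-sparse case.

```python
import os
import tempfile
import numpy as np

n = 400      # illustrative; a real use case would be far larger
block = 100  # number of result rows computed per step
path = os.path.join(tempfile.mkdtemp(), "product.dat")  # hypothetical file name

# mode="w+" creates a zero-filled file of the required size and maps it.
out = np.memmap(path, dtype=np.int32, mode="w+", shape=(n, n))

rng = np.random.default_rng(0)
a = (rng.random((n, n)) < 0.006).astype(np.int32)
b = (rng.random((n, n)) < 0.006).astype(np.int32)

# Compute the product one block of rows at a time and write it to the file.
for start in range(0, n, block):
    stop = start + block
    out[start:stop, :] = a[start:stop, :] @ b
out.flush()  # push any pages still in memory out to disk
```

Only one block of result rows has to be resident at a time; the operating system pages the rest of the file in and out as needed.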

Note this solves the complaint of the below complainer that it won't fit in memory. However, the OP did say sparse, which implies very few actual datapoints spread out in giant arrays, and Numpy / SciPy handle this natively and thus nicely (lots of people at Fermilab use Numpy / SciPy regularly, I'm confident the sparse matrix code is well tested).

Did you read through the question? Matrices from OP question can be too large to be stored in RAM. And **any known** matrix multiplication algorithm is not better than O(n^2.7), which is a huge number for OP case. — alko, Dec 27 '13 at 21:09
**Disagree**: a sparse matrix does not allocate an array of m*n. It only allocates memory used by the __actual number of elements__. The OP references a **sparse** matrix that's very large. Coding up a sparse matrix in Python can be nasty. Since SciPy / Numpy use C language optimized arrays, probably of linked lists which would optimize memory, this is quite possible. Any mostly-empty sparse matrix multiplied by another mostly-empty sparse matrix should definitely fit in memory. Besides, on a Linux system, the VMM can map memory onto disk and it will work nicely. — Kevin J. Rice, Dec 30 '13 at 17:41
Worth noting that scipy isn't actually "implemented in `C`", being mostly comprised of `C++` and `FORTRAN`. — Slater Victoroff, Dec 30 '13 at 18:15
Kevin, note that sparsity is specified at 0.6% of data, which in fact is not very sparse. That is why I think that when the size grows by orders of magnitude as specified, and an 8*10**6 x 8*10**6 matrix is at hand, it will contain 0.006*6.4*10**13=3.84*10**11 nonzero values, which is a huge number and won't fit in memory. The previous size, 8*10**5, is a boundary in this respect, and one has to be careful. And that is leaving aside the power law of multiplication complexity. — alko, Dec 30 '13 at 18:16
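
The arithmetic in the last comment can be reproduced with a short sketch. The figure of 9 bytes per stored nonzero (a 1-byte value plus an 8-byte index) is an assumption made for this estimate, not a property of any particular library, and the helper name `estimate` is made up.

```python
DENSITY_PER_MILLE = 6   # 0.6%, the upper bound given in the question
BYTES_PER_NONZERO = 9   # assumption: 1-byte value + 8-byte index

def estimate(n):
    """Return (nonzero count, approximate bytes) for one n x n input matrix."""
    nnz = n * n * DENSITY_PER_MILLE // 1000
    return nnz, nnz * BYTES_PER_NONZERO

for n in (80_000, 8_000_000):
    nnz, size = estimate(n)
    print(f"n = {n:>9,}: {nnz:.3g} nonzeros, about {size / 1e9:,.1f} GB")
```

These numbers cover a single input matrix only; the product is typically denser than its inputs, so it would need more.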
