
I am currently using Cython, but my code involves taking the norm of each row/column combination multiple times in a loop, so even a 1000x1000 matrix is extremely slow. I am thinking of switching to C + LAPACK. What's the standard approach for working with huge matrices?

It looks like the most expensive steps involve taking the dot product of all row/column combinations and looping over the upper triangle of the matrix.

The matrices are not sparse.

methane

2 Answers


Try NumPy; 1000x1000 sounds kinda small for it.
It is implemented in C and can take advantage of optimized linear algebra libraries, so it can be fast. Read this.
If there aren't many nonzero entries, use sparse matrices.

import numpy as np
A = np.random.rand(1000,1000)
np.linalg.norm(A, axis=1)    # takes me under 5 ms

You should provide more details if you want more specific help.

Prashant Kumar
    +1, but OP said `10000x10000`, which is pretty big even for numpy. For me, `timeit np.linalg.norm(A, axis=1)` gives `1.59 s per loop` – askewchan Nov 26 '13 at 20:14
  • Ah, I got lost in all the zeros. 10000x10000 is pretty big, so maybe another venue besides Numpy should be investigated, like C code/libraries. But it is telling that he says 1000x1000 is extremely slow. Sounds to me like the code needs to take better advantage of vectorization. – Prashant Kumar Nov 26 '13 at 20:19
  • Yeah, I agree. If they get bigger though, it's memory that's an issue not just speed, and something like PyTables might be necessary. – askewchan Nov 26 '13 at 20:21
  • Hm, actually, `timeit np.sqrt(np.einsum('ij,ij->i', A, A))` gives `147 ms per loop` – askewchan Nov 26 '13 at 20:26
  • For `n=1e5; np.random.rand(n, n)`, sqrt-einsum takes me just `45 ms per loop`. But using `linalg.norm`, my ipython session dies! – Prashant Kumar Nov 26 '13 at 21:24
  • Ha, funny that your `einsum` is faster, but my `linalg.norm` at least survives :P – askewchan Nov 26 '13 at 21:28

In C, as a general guideline and with no sample code to look at, the main thing you can do to optimize matrix operations is to use contiguous memory blocks, so that whole matrices can be kept in the processor's cache (or at least so that RAM traffic is reduced to a minimum). That is, if you are dynamically allocating memory, request a single block for each matrix, and then either compute the indexes arithmetically:

for (i = 0; i < rows; i++)
{
  for (j = 0; j < columns; j++)
  {
    matrix[i*columns + j] = do_whatever();  /* row-major: row i starts at offset i*columns */
  }
}

or create a set of pointers to the beginning of each row if you prefer to use the standard [i][j] notation. This approach can reduce performance slightly, since the processor has to handle two arrays instead of one for a single matrix; if you are using plain static arrays, you won't have to worry about it.
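A minimal sketch of that second approach (the function names here are made up for illustration): one contiguous allocation holds the data, and a separate array of row pointers makes `m[i][j]` indexing work on top of it.

```c
#include <stdlib.h>

/* Allocate rows x cols doubles as ONE contiguous block, plus an array of
 * row pointers so the matrix can be indexed as m[i][j]. */
double **alloc_matrix(size_t rows, size_t cols)
{
    double *block = malloc(rows * cols * sizeof *block);
    double **m = malloc(rows * sizeof *m);
    if (!block || !m) { free(block); free(m); return NULL; }
    for (size_t i = 0; i < rows; i++)
        m[i] = block + i * cols;   /* row i starts at offset i*cols */
    return m;
}

void free_matrix(double **m)
{
    if (m) { free(m[0]); free(m); }  /* m[0] is the contiguous block */
}
```

Because the data itself is still one block, cache behavior stays the same as with arithmetic indexing; only the extra pointer lookup is added.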

The other important change you can make is to parallelize your calculations across multiple threads.

Working with large matrices is inherently expensive, and further optimization tricks can only be applied if certain assumptions about the data can be made, like symmetry or some other property that could save you some operations.