
After some research on Stack Overflow, I didn't find a simple answer to my problem, so I'm sharing my code here in the hope of getting some help.

import numpy as np

S=np.random.random((495,930,495,3,3))
#The shape of S is (495,930,495,3,3)

#I want to calculate some features for each small (3,3) array S[z,y,x]
for z in range(S.shape[0]):
    for y in range(S.shape[1]):
        for x in range(S.shape[2]):
            res[z,y,x,0]=np.array(np.linalg.det(S[z,y,x])/np.trace(S[z,y,x]))
            res[z,y,x,1]=np.array(S[z,y,x].mean())
            res[z,y,x,2:]=np.array(np.linalg.eigvals(S[z,y,x]))

Here is my problem: the S array is huge, so I was wondering whether it is possible to make this for loop faster.

Max
  • Every function used is already vectorized in `numpy` (with appropriate arguments or for the last two dimensions): `np.linalg.det(S) / np.trace(S, axis1=-2, axis2=-1)`, `S.mean((-2,-1))`, `np.linalg.eigvals(S)`. No need for a loop, speed up by ~25x. – Michael Szczesny Sep 27 '22 at 13:45
  • @MichaelSzczesny that sounds like it should be the real answer. Can you post it with a proper code sample? – knittl Sep 27 '22 at 17:13
  • @knittl - No, I won't. There are several problems with this question: *res* is not defined, multiple issues in one question, the solutions are just simple cases of well documented `numpy` functions, the example data is way too large. – Michael Szczesny Sep 27 '22 at 17:29
  • @MichaelSzczesny simple perhaps for you, but not for everybody. Documentation can be confusing, difficult to understand, or hard to find. Or it might not be obvious, how to apply the documented behavior to one's own concrete problems. Regarding missing `res`: perhaps an oversight (could have been defined before the posted snippet). How to _properly_ use numpy in this case is definitely not obvious to the OP. – knittl Sep 27 '22 at 17:50
  • @max: Can you share the definition of `res`? Or rather a fully runnable code snippet? (Perhaps with a smaller S that doesn't use 10GB of memory) – knittl Sep 27 '22 at 19:10
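A minimal sketch of the vectorized approach suggested in the comments above. Two assumptions on my part: `res` is taken to hold five features per (3,3) block (det/trace, the mean, and the three eigenvalues), and it gets a complex dtype because `np.linalg.eigvals` can return complex values for a general real matrix.

import numpy as np

# Smaller shape so the sketch runs quickly; the real array is (495, 930, 495, 3, 3).
S = np.random.random((49, 93, 49, 3, 3))

# Assumed layout: 5 features per (3, 3) block (det/trace, mean, 3 eigenvalues).
res = np.empty(S.shape[:3] + (5,), dtype=complex)

# All three computations below operate on the last two axes of the stacked
# array (for trace and mean the axes are given explicitly), so no
# Python-level loop is needed.
res[..., 0] = np.linalg.det(S) / np.trace(S, axis1=-2, axis2=-1)
res[..., 1] = S.mean(axis=(-2, -1))
res[..., 2:] = np.linalg.eigvals(S)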

1 Answer


I had to reduce the shape to (49,93,49,3,3) so that the script runs in an acceptable time on my hardware. I was able to shave off 5-10% by avoiding unnecessary work (without changing your algorithm). Unnecessary work includes, but is not limited to:

  • Performing (global) lookups
  • Calculating the same value several times

You might also want to try a different Python runtime, such as PyPy instead of CPython.

Here is my updated version of your script:

#!/usr/bin/python

import numpy as np

def main():
    # avoid lookups
    array = np.array
    trace = np.trace
    eigvals = np.linalg.eigvals
    det = np.linalg.det

    #The shape of S is (495,930,495,3,3)
    shape = (49,93,49,3,3) # so my computer can run it
    S=np.random.random(shape)

    res = np.ndarray(shape) # missing from the question, I hope this is correct

    #I want to calculate some features for each small (3,3) array S[z,y,x]
    # get shape only once, instead of z times for shape1 and z*y times for shape2
    shape1 = S.shape[1]
    shape2 = S.shape[2]
    for z in range(S.shape[0]):
        for y in range(shape1):
            for x in range(shape2):
                # get value once instead of 4 times
                s = S[z,y,x]
                res[z,y,x,0]=array(det(s)/trace(s))
                res[z,y,x,1]=array(s.mean())
                res[z,y,x,2:]=array(eigvals(s))

# function to have local (vs. global) lookups
main()

Runtime was reduced from 25 to 23 seconds (measured with hyperfine).
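If hyperfine isn't available, a rough in-process measurement is possible as well. Here is a sketch (not the measurement used above) that would replace the bare `main()` call at the bottom of the script; note that hyperfine also counts interpreter start-up, so the numbers won't match exactly:

import time

start = time.perf_counter()
main()
elapsed = time.perf_counter() - start
print(f"main() took {elapsed:.1f} s")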


knittl
    I can't reproduce any speed up by using *local variables*. This was an optimization in older python versions. `s = S[z,y,x]` is slightly faster, but it adds up as it's used in each iteration (223293 * 4 times). – Michael Szczesny Sep 27 '22 at 14:04