Multiple arithmetic operations efficiently on NumPy

Question

I have the following data for a Python program.

import numpy as np

np.random.seed(28)

n = 100000
d = 60

S = np.random.rand(n)
O = np.random.rand(n, d, d)
p = np.random.rand()
mask = np.where(S < 0.5)

And I want to run the following algorithm:

def method1():
    sum_p = np.zeros([d, d])
    sum_m = np.zeros([d, d])

    for k in range(n):
        s = S[k] * O[k]
        sum_p += s
        if(S[k] < 0.5):
            sum_m -= s

    return p * sum_p + sum_m

This is a minimal example, but the code in method1() is supposed to be run many times in my project, so I would like to rewrite it in a more pythonic way, to make it as efficient as possible. I have tried with the following method:

def method2():
    sall = S[:, None, None] * O
    return p * sall.sum(axis=0) - sall[mask].sum(axis=0)

But, although this method performs better with low values of d, when d=60 it does not provide good times:

# To check that both methods provide the same result.
In [1]: np.sum(method1() == method2()) == d*d 
Out[1]: True

In [2]: %timeit method1()
Out[2]: 801 ms ± 2.98 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [3]: %timeit method2()
Out[3]: 1.91 s ± 6.17 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Do you have any other ideas to optimize this method?

(As additional information, the variable mask is supposed to be used in other parts of my final code, so I don't need to consider it inside the code of method2 for the time computation.)

No. I will try it, but my primary interest is to rewrite the code in the most efficient way rather than this. — gtf, Dec 11 '18 at 16:02
It looks like [numpy overhead can be significant](https://stackoverflow.com/questions/53093073/why-is-numpy-sum-10-times-slower-than-the-operator?rq=1) — A.Comer, Dec 11 '18 at 16:55
With these sizes memory management issues dominate. On my modest machine I can't use `method2` because of memory error in the `S*O` step. We've seen in other SO that itertive solutions can be faster when the alternative creates arrays that are too big. — hpaulj, Dec 11 '18 at 17:32

Multiple arithmetic operations efficiently on NumPy

0 Answers0