
I have the following data for a Python program.

import numpy as np

np.random.seed(28)

n = 100000
d = 60

S = np.random.rand(n)
O = np.random.rand(n, d, d)
p = np.random.rand()
mask = np.where(S < 0.5)

And I want to run the following algorithm:

def method1():
    sum_p = np.zeros([d, d])
    sum_m = np.zeros([d, d])

    for k in range(n):
        s = S[k] * O[k]      # scale the k-th (d, d) matrix by S[k]
        sum_p += s
        if S[k] < 0.5:
            sum_m -= s

    return p * sum_p + sum_m

This is a minimal example, but the code in method1() is meant to be run many times in my project, so I would like to rewrite it in a more Pythonic way to make it as efficient as possible. I have tried the following method:

def method2():
    sall = S[:, None, None] * O      # materializes an (n, d, d) intermediate array
    return p * sall.sum(axis=0) - sall[mask].sum(axis=0)

But although this method performs better for small values of d, for d = 60 it is noticeably slower:

# To check that both methods provide the same result.
In [1]: np.sum(method1() == method2()) == d*d 
Out[1]: True

In [2]: %timeit method1()
Out[2]: 801 ms ± 2.98 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [3]: %timeit method2()
Out[3]: 1.91 s ± 6.17 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Do you have any other ideas to optimize this method?

(As additional information: the variable mask is also used in other parts of my final code, so building it does not need to be counted as part of method2's running time.)
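
One possible alternative, sketched below under the assumption that method2's cost comes from materializing the (n, d, d) intermediate: the weighted sum over k is a tensor contraction, so np.tensordot (or equivalently np.einsum('k,kij->ij', S, O)) can compute it directly without building the full product array. The name method3 is just illustrative:

def method3():
    # Contract S (shape (n,)) against O (shape (n, d, d)) over axis 0:
    # this is sum_k S[k] * O[k] without building the (n, d, d) product.
    sum_all = np.tensordot(S, O, axes=(0, 0))                # shape (d, d)
    # Same contraction restricted to the entries selected by mask.
    sum_masked = np.tensordot(S[mask], O[mask], axes=(0, 0))
    return p * sum_all - sum_masked

Since the summation order differs from method1, the results may agree only up to floating-point rounding, so np.allclose(method1(), method3()) is a more appropriate check than exact equality.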

  • Did you try numba to compile it on the fly? – Matthieu Brucher Dec 11 '18 at 15:47
  • No, I will try it, but my primary interest is to rewrite the code in the most efficient way rather than compiling it. – gtf Dec 11 '18 at 16:02
  • It looks like [numpy overhead can be significant](https://stackoverflow.com/questions/53093073/why-is-numpy-sum-10-times-slower-than-the-operator?rq=1) – A.Comer Dec 11 '18 at 16:55
  • With these sizes memory management issues dominate. On my modest machine I can't use `method2` because of a memory error in the `S*O` step. We've seen in other SO questions that iterative solutions can be faster when the alternative creates arrays that are too big. – hpaulj Dec 11 '18 at 17:32
  • I think method 1 is near optimal – B. M. Dec 11 '18 at 19:47
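
Regarding the numba suggestion in the comments, here is a minimal sketch of that approach, assuming numba is installed (the names _accumulate and method_numba are illustrative); the globals are passed in as arguments so numba does not freeze them at compile time:

from numba import njit

@njit(cache=True)
def _accumulate(S, O, p):
    # Same loop as method1, compiled to machine code by numba.
    d = O.shape[1]
    sum_p = np.zeros((d, d))
    sum_m = np.zeros((d, d))
    for k in range(O.shape[0]):
        s = S[k] * O[k]
        sum_p += s
        if S[k] < 0.5:
            sum_m -= s
    return p * sum_p + sum_m

def method_numba():
    return _accumulate(S, O, p)

The first call pays the JIT compilation cost; later calls run the compiled loop directly.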

0 Answers