In simplified terms, I think this is what the loops are doing:
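import numpy as np

upgain=np.array([.1,.2,.3,.4])
u=[]
up=1                 # previous value, seeded with 1
for x in upgain:
    u1=10*up+x       # scale the previous value, then add the new term
    u.append(u1)
    up=u1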
producing:
[10.1, 101.2, 1012.3, 10123.4]
np.cumprod([10,10,10,10]) is there, plus a modified cumsum for the [.1,.2,.3,.4] terms. But I can't offhand think of a way of combining these with compiled numpy functions. We could write a custom ufunc, and use its accumulate. Or we could write it in cython (or other c interface).
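One possible way to combine the two, offered as a sketch and not something taken from the linked answers: dividing the recurrence u[n] = 10*u[n-1] + x[n] by 10**n turns it into a plain running sum, which cumprod and cumsum can handle. The names closed_form and loop_reference are made up for illustration, and the seed value up0=1 is assumed from the loop above. Because it forms 10**n explicitly, it only works while that power fits in a float (roughly n < 308), so it is not a general replacement for the loop.

```python
import numpy as np

def closed_form(upgain, up0=1.0, factor=10.0):
    """Unrolled u[n] = factor*u[n-1] + upgain[n-1], seeded with u[0] = up0.

    Hypothetical helper: returns u[1:], like the loop's list u.
    """
    upgain = np.asarray(upgain, dtype=float)
    # factor**1, factor**2, ... (overflows to inf for long inputs)
    powers = np.cumprod(np.full(upgain.shape, factor))
    # u[n] / factor**n = up0 + sum_k upgain[k] / factor**k
    return powers * (up0 + np.cumsum(upgain / powers))

def loop_reference(upgain, up0=1.0, factor=10.0):
    """Plain Python loop, used only to check the unrolled form."""
    out = []
    up = up0
    for x in upgain:
        up = factor * up + x
        out.append(up)
    return np.array(out)

upgain = np.array([.1, .2, .3, .4])
print(closed_form(upgain))
print(np.allclose(closed_form(upgain), loop_reference(upgain)))
```

The two results agree up to floating point rounding for this short input. For long inputs the frompyfunc route below is the safer choice.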
https://stackoverflow.com/a/27912352 suggests that frompyfunc is a way of writing a generalized accumulate. I don't expect big time savings, maybe 2x.
To use frompyfunc, define:
def foo(x,y):return 10*x+y
The loop application (above) would be
def loopfoo(upgain,u,u1):
    for x in upgain:
        u1=foo(u1,x)
        u.append(u1)
    return u
The 'vectorized' version would be:
vfoo=np.frompyfunc(foo,2,1) # 2 in arg, 1 out
vfoo.accumulate(upgain,dtype=object).astype(float)
The dtype=object requirement was noted in the SO answer linked above, and in https://github.com/numpy/numpy/issues/4155
In [1195]: loopfoo([1,.1,.2,.3,.4],[],0)
Out[1195]: [1, 10.1, 101.2, 1012.3, 10123.4]
In [1196]: vfoo.accumulate([1,.1,.2,.3,.4],dtype=object)
Out[1196]: array([1.0, 10.1, 101.2, 1012.3, 10123.4], dtype=object)
For this small list, loopfoo is faster (3 µs vs. 21 µs).
For a 100 element array, e.g. biggain=np.linspace(.1,1,100), the vfoo.accumulate is faster:
In [1199]: timeit loopfoo(biggain,[],0)
1000 loops, best of 3: 281 µs per loop
In [1200]: timeit vfoo.accumulate(biggain,dtype=object)
10000 loops, best of 3: 57.4 µs per loop
For an even larger biggain=np.linspace(.001,.01,1000) (smaller numbers to avoid overflow), the 5x speed ratio remains.
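To reproduce the comparison outside IPython, something like the following script should work. It reuses foo, loopfoo and vfoo as defined above and the 100-element biggain. The repeat count of 1000 and the use of the timeit module are my choices, not part of the original timings, and the numbers will differ by machine and numpy version.

```python
import timeit
import numpy as np

def foo(x, y):
    return 10 * x + y

def loopfoo(upgain, u, u1):
    # Plain Python loop: carry u1 forward and collect each step in u.
    for x in upgain:
        u1 = foo(u1, x)
        u.append(u1)
    return u

vfoo = np.frompyfunc(foo, 2, 1)  # 2 inputs, 1 output

biggain = np.linspace(.1, 1, 100)

# Starting the loop at u1=0 makes its first element equal biggain[0],
# which is also the first element accumulate returns.
loop_result = np.array(loopfoo(biggain, [], 0), dtype=float)
vec_result = vfoo.accumulate(biggain, dtype=object).astype(float)
print("results match:", np.allclose(loop_result, vec_result))

n = 1000  # assumed repeat count
t_loop = timeit.timeit(lambda: loopfoo(biggain, [], 0), number=n)
t_vec = timeit.timeit(lambda: vfoo.accumulate(biggain, dtype=object), number=n)
print("loopfoo:         %.1f us per call" % (1e6 * t_loop / n))
print("vfoo.accumulate: %.1f us per call" % (1e6 * t_vec / n))
```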