0

This is based on a previous question of mine Replacing for loops with function call inside with broadcasting/vectorized solution but is more complicated, and the additional complications have me at a loss.

Here are some similar posts, however, I was not able to get my problem to work after reading the solutions to their problems:

Broadcasting custom function

Broadcasting function calls in np arrays

Broadcasting a function call to a a 3D array in python

Broadcasting a python function onto np arrays

Here is the MWE

def OneD(x, y, z, r_1, r_2):
    ret = 0.0
    for i in range(x+1):
        for j in range(y+1):
            m = i + j
            if m % 2 == 0:
                ret += np.exp(i+r_1)**(j+1) / (z+1+r_2)
    return ret 

def ThreeD(a,b,c, d, e):
    value = OneD(a[0],b[0], c, d[0], e[0])
    value *= OneD(a[1],b[1], c, d[1], e[1])
    value *= OneD(a[2],b[2], c, d[2], e[2])    
    return value

M1 = M2 = np.array(([[0,0,0],[0,0,1], [1,1,1], [1,0,2]]), dtype=int) 
scales0 = scales1 = np.array([1.1, 2.2, 3.3, 4.4])
cc0 = cc1 = cc2 =1.77    # for simplicity, all constants made the same
r1 = np.array([0.0,0.0,0.0])
r2 = r1 + 1.0   
results = np.zeros((4,4))

for s0, n0, in enumerate(M1):
    for s1, n1, in enumerate(M2):
        v = ThreeD(n0, n1, cc2, r1, r2)
        v *= cc0 * cc1 * scales0[s0] * scales1[s1]
        results[s0, s1] += v

Similar to the vectorized/Broadcasted version created for the simpler problem linked at the start, I am trying to get rid of the for loops and speed up the calculation by performing vectorized calls to function OneD

v  = np.prod(OneD(M1[:,None,:], M2[None,:,:], np.arange(4)[None,:,None], r1, r2), axis=2)
results = v * cc0*cc1*np.array(scales0)[:,None]*np.array(scales1)[None,:] 

The problem I have is that now, when using OneD, scalars are not being passed, but instead arrays are appearing. This tells me I need to pad the arrays further (maybe go up 2 dimensions?)? I am struggling with padding the arrays correctly. I think I will need to make the arrays have more dimensions, and then squeeze them after, but I am unclear on how to do this. My numerous attempts have been wrong, and always keep on ending up sending arrays to OneD. For this example I need to end up with a (4,4) array.

[[  7.07469713e-02   1.41493943e-01   2.12240914e-01   5.65975771e-01]                                                                                                          
 [  1.41493943e-01   2.37400124e+00   3.56100187e+00   5.31397826e+00]                                                                                                          
 [  2.12240914e-01   3.56100187e+00   3.75915002e+02   6.68688926e+01]                                                                                                          
 [  2.37400124e+00   8.93002921e+00   1.12371774e+02   3.99028687e+03]]
Charlie Crown
  • 1,071
  • 2
  • 11
  • 28
  • In my previous answer, your `OneD` worked with arrays. Just `np.exp(x)**(y+1) / (z+1)`. You need to focus on writing this new `OneD` in a similar manner. – hpaulj Jun 22 '20 at 19:39
  • I think it is vectorized already. each variable that I am sending to `ThreeD` is a vector of length 3, with the exception of 1 variable which is a scalar. So I should be able to skip `ThreeD` and use a vectorized `OneD` – Charlie Crown Jun 23 '20 at 06:57
  • With the previous `OneD` you could pass array `a` etc to `OneD`. Here that would raise an error, since `range(x+1)` requires a scalar `x`. The `if` will also raise an error if given an array. – hpaulj Jun 23 '20 at 07:02
  • @hpaulj I wouldn't mind your opinion on my solution. I think I nailed it :) – Charlie Crown Jul 05 '20 at 05:30

1 Answers1

1

The solution is to make use of np.frompyfunc. It can be used as a direct replacement of np.vectorize, but is not hella slow like np.vectorize

Solution

myfunc3 = np.frompyfunc(OneD,5,1)

v  = np.prod(myfunc3(M1[:,None,:], M2[None,:,:], cc2, r1, r2), axis=2)
results = v * cc0*cc1*np.array(scales0)[:,None]*np.array(scales1)[None,:]

For anyone interested, below are a bunch of different trial method, The solution is the last method

Test suite

import numpy as np
from timeit import default_timer as tm
############################################################################
#
#   Original code uses ThreeD and OneD in a double for loop
#     My goal is to first, get rid of ThreeD
#                   Second, get rid of for loops
#             Progress: checkmark to all. Final trial uses np.frompyfunc
#                       np.frompyfunc gets rid of double for loops.
#
###########################################################################

def OneD(x, y, z, r_1, r_2):
    ret = 0.0
    for i in range(x+1):
        for j in range(y+1):
            m = i + j
            if m % 2 == 0:
                ret += np.exp(i+r_1)**(j+1) / (z+1+r_2)
    return ret 

def ThreeD(a,b,c, d, e):
    value =  OneD(a[0],b[0], c, d[0], e[0])
    value *= OneD(a[1],b[1], c, d[1], e[1])
    value *= OneD(a[2],b[2], c, d[2], e[2])    
    return value

#######################################################
#
#   Variables used by all trials
#
#######################################################
M1 = M2 = np.array(([[0,0,0],[0,0,1], [0,1,0], [1,0,0], [2,0,0], [0,2,0],[0,0,2], [1,1,0],[1,0,1],[0,1,1]]), dtype=int) 
scales0 = scales1 = np.random.rand(len(M1))*10.0 # np.array([1.1, 2.2, 3.3, 4.4])
print(scales0)
cc0 = cc1 = cc2 =1.77    # for simplicity, all constants made the same
r1 = np.array([0.0,0.0,0.0])
r2 = r1 + 1.0   

#################################################################
#
#   Original double for loop method - base case
#  No dessert until it is slower than at least one other method
#
#################################################################
results = np.zeros((len(M1),len(M2)))
sb = tm()
for s0, n0, in enumerate(M1):
    for s1, n1, in enumerate(M2):
        v = ThreeD(n0, n1, cc2, r1, r2)
        v *= cc0 * cc1 * scales0[s0] * scales1[s1]
        results[s0, s1] += v
e1 = tm()   
print(results)
answer_longway = np.sum(results)
print("Value for long way is {} and time is {}".format(answer_longway, e1-sb) )

######################################################################
######################################################################
#
#                      np.vectorize (a slow way)
#
######################################################################
######################################################################
results = np.zeros((len(M1),len(M2)))
myfunc = np.vectorize(OneD)
v = 0
s2 = tm()
for s0, n0, in enumerate(M1):
    for s1, n1, in enumerate(M2):
        v = np.prod(myfunc(n0, n1, cc2, r1, r2))
        v *= cc0 * cc1 * scales0[s0] * scales1[s1]
        results[s0, s1] += v
e2 = tm()
answer_np_vectorize = np.sum(results)
print("Value for np.vectorize way is {} and time is {}".format(answer_np_vectorize, e2-s2))

######################################################################
######################################################################
#
#                      np.frompyfunc (a same as loop way)
#
######################################################################
######################################################################
results = np.zeros((len(M1),len(M2)))
v = 0
myfunc2 = np.frompyfunc(OneD,5,1)
s3 = tm()
for s0, n0, in enumerate(M1):
    for s1, n1, in enumerate(M2):
        v = np.prod(myfunc2(n0, n1, cc2, r1, r2))
        v *= cc0 * cc1 * scales0[s0] * scales1[s1]
        results[s0, s1] += v
e3 = tm()
answer_np_vectorize = np.sum(results)
print("Value for np.frompyfunc way is {} and time is {}".format(answer_np_vectorize, e3-s3))

######################################################################
######################################################################
#
#                      broadcasted (a fast way?)
#
######################################################################
######################################################################
results = np.zeros((len(M1),len(M2)))
myfunc3 = np.frompyfunc(OneD,5,1)

s4 = tm()
v  = np.prod(myfunc3(M1[:,None,:], M2[None,:,:], cc2, r1, r2), axis=2)
results = v * cc0*cc1*np.array(scales0)[:,None]*np.array(scales1)[None,:]
e4 = tm()

answer_broadcast = np.sum(results)
print("Value for broadcasted way is {} and time is {}".format(answer_broadcast, e4-s4))

A sample output is

Value for long way is 2069.471054878208 and time is 0.00458109400351532                                                       
Value for np.vectorize way is 2069.471054878208 and time is 0.01092407900316175                                               
Value for np.frompyfunc way is 2069.471054878208 and time is 0.00398122602999792434                                             
Value for broadcasted way is 2069.471054878208 and time is 0.0023705440033460036 

Note:

M1 and M2 are a bigger in the trial runs than the original question.

Charlie Crown
  • 1,071
  • 2
  • 11
  • 28