I am currently trying to vectorize some code I had written using a large for loop in Python. The vectorized code is as follows:
rho[pi,pj] += (rho_coeff*dt)*i_frac*j_frac
rho[pi+1,pj] += (rho_coeff*dt)*ip1_frac*j_frac
rho[pi,pj+1] += (rho_coeff*dt)*i_frac*jp1_frac
rho[pi+1,pj+1] += (rho_coeff*dt)*ip1_frac*jp1_frac
Each of pi
, pj
, dt
, i_frac
, j_frac
, ip1_frac
, jp1_frac
is a numpy array of one dimension and all of the same length. rho
is a two dimensional numpy array. pi
and pj
make up a list of coordinates (pi
,pj
) which indicate which element of the matrix rho
is modified. The modification involves the addition of the (rho_coeff*dt)*i_frac*j_frac
term to the (pi
,pj
) element as well as addition of similar terms to neighbouring elements: (pi
+1,pj
), (pi
,pj
+1) and (pi
+1,pj
+1). Each coordinate in the list (pi
, pj
) has a unique dt
, i_frac
, j_frac
, ip1_frac
and jp1_frac
associated with it.
The problem is that the list can have (and always will have) repeating coordinates. So instead of successively adding to rho
each time the same coordinate is encountered in the list, it only adds the term corresponding to the last repeating coordinate. This problem is described briefly with an example in the Tentative Numpy Tutorial under fancy indexing with arrays of indices (see the last three examples before boolean indexing). Unfortunately they did not provide a solution to this.
Is there a way of doing this operation without resorting to a for
loop? I am trying to optimize for performance and want to do away with a loop if possible.
FYI: this code forms part of a 2D particle tracking algorithm where the charge from each particle is added to the four adjacent nodes of a mesh surrounding the particle's position based on volume fractions.