Vectorizing nested loop with conditionals and functions

Question

I have the following function:

def F(x):    #F receives a numpy vector (x) with size (xsize*ysize)

    ff = np.zeros(xsize*ysize)

    count=0 

    for i in range(xsize):
       for j in range(ysize):

           a=function(i,j,xsize,ysize)

           if (a>xsize):
               ff[count] = x[count]*a
           else
               ff[count] = x[count]*i*j

           count = count +1

    return ff

There is one nuance here which is the fact that (example for xsize =4, ysize=3)

c=count
x[c=0] corresponds to x00 i=0,j=0
x[c=1]   x01   i=0, j=1
x[c=2]   x02   i=0, j=2   (i=0,  j = ysize-1)
x[c=3]   x10   i=1, j=0
...   ...  ...
x[c=n]   x32    i=3 j=2  (i=xsize-1, j=ysize-1)

My code is simply

ff[c] = F[x[c]*a (condition 1)
ff[c] = F[x[c]*i*j (condition 2)

I could avoid the nested loop using broadcasting as explained in this link:

Python3: vectorizing nested loops

But in this case I have to call the function(i,j,xsize,ysize) and then I have conditionals. I really need to know the value of i and j.

Is it possible to vectorize this function?

Edit: the function(i,j,xsize,ysize) will use sympy to perform symbolic calculations to return a float. So a is a float, not a symbolic expression.

WHat does `function` do? It’s fairly essential to see if this can be vectorized. — Oliver W., Dec 11 '19 at 23:40
should `count += 1` be indented to the inner loop? right now it is only in the outer loop which means most of your assignments are overriding themselves and not all of `ff` is initialized. — Tadhg McDonald-Jensen, Dec 11 '19 at 23:43
The function is relatively complex. It involves symbolic calculations with dot products and matrix multiplications. — AA10, Dec 11 '19 at 23:43
`symbolic calculations`? as with `sympy`? if so, adjust the tags and example. — hpaulj, Dec 11 '19 at 23:47
@AA10. Matrix multiplication and division product can be vectorized easily. Symbolic stuff not so much probably — Mad Physicist, Dec 11 '19 at 23:48
To apply conditions in a 'vectorized' fashion you have to construct a mask or masks, which is True where you want the assignment or action,and False where not. The whole point to `numpy` vectorization is to think in terms of whole-array operations. Often that requires stepping back from the element-by-element focus. — hpaulj, Dec 11 '19 at 23:49
everyone, `function` only depends on the **size** of the input. all that is needed here is a cache of the functions results for a given size, then reuse that for different values of `x` of same size. — Tadhg McDonald-Jensen, Dec 11 '19 at 23:50
@hpaulj `sympy`, yes. For example one of the first lines is `v = vx*N.x + vy*N.y` where vx and vy are determined by `i` and `j`. — AA10, Dec 11 '19 at 23:52
IN https://stackoverflow.com/questions/59242194/attributeerror-mul-object-has-no-attribute-sqrt, I demonstrate how `sympy` and `numpy` can play together; it isn't trivial. — hpaulj, Dec 12 '19 at 00:00
@hpaulj but is it problematic if the functions returns a float? — AA10, Dec 12 '19 at 00:06
If `function` involves symbolic calculations with SymPy then there will not be any *performance* benefit from vectorisations. — Oscar Benjamin, Dec 12 '19 at 00:31

Tadhg McDonald-Jensen · Accepted Answer · 2019-12-12T02:33:48.490

The first thing to note is that your function F(x) can be described as x(idx) * weight(idx) for each index, where weight is only dependent on the dimensions of x. So lets structure our code in terms of a function get_weights_for_shape so that F is fairly simple. For sake of simplicity weights will be a (xsize by size) matrix but we can let F work for flat inputs too:

def F(x, xsize=None, ysize=None):
    if len(x.shape) == 2:
        # based on how you have put together your question this seems like the most reasonable representation.
        weights = get_weights_for_shape(*x.shape)
        return x * weights
    elif len(x.shape) == 1 and xsize * ysize == x.shape[0]:
        # single dimensional input with explicit size, use flattened weights.
        weights = get_weights_for_shape(xsize, ysize)
        return x * weights.flatten()
    else:
        raise TypeError("must take 2D input or 1d input with valid xsize and ysize")


# note that get_one_weight=function can be replaced with your actual function.
def get_weights_for_shape(xsize, ysize, get_one_weight=function):
    """returns weights matrix for F for given input shape"""
    # will use (xsize, ysize) shape for these calculations.
    weights = np.zeros((xsize,ysize))
    #TODO: will fill in calculations here
    return weights

So first we want to run your function (which I have aliased get_one_weight inside the parameters) for each element, you have said that this function can't be vectorized so we can just use list comprehension. We want a matrix a that has the same shape (xsize,ysize) so the comprehension is a bit backwards for a nested list:

# notice that the nested list makes the loops in opposite order:
# [ROW for i in Xs]
#  ROW = [f() for j in Ys]
a = np.array([[get_one_weight(i,j,xsize,ysize)
                    for j in range(ysize)
              ] for i in range(xsize)])

With this matrix a > xsize will give a boolean array for conditional assignment:

case1 = a > xsize
weights[case1] = a[case1]

For the other case, we use the index i and j. To vectorize a 2D index we can use np.meshgrid

[i,j] = np.meshgrid(range(xsize), range(ysize), indexing='ij')
case2 = ~case1 # could have other cases, in this case it's just the rest.
weights[case2] = i[case2] * j[case2]

return weights #that covers all the calculations

Putting it all together gives this as the fully vectorized function:

# note that get_one_weight=function can be replaced with your actual function.
def get_weights_for_shape(xsize, ysize, get_one_weight=function):
    """returns weights matrix for F for given input shape"""
    # will use (xsize, ysize) shape for these calculations.
    weights = np.zeros((xsize,ysize))

    # notice that the nested list makes the loop order confusing:
    # [ROW for i in Xs]
    #  ROW = [f() for j in Ys]
    a = np.array([[get_one_weight(i,j,xsize,ysize)
                        for j in range(ysize)
                  ] for i in range(xsize)])

    case1 = (a > xsize)
    weights[case1] = a[case1]

    # meshgrid lets us use indices i and j as vectorized matrices.
    [i,j] = np.meshgrid(range(xsize), range(ysize), indexing='ij')
    case2 = ~case1
    weights[case2] = i[case2] * j[case2]
    #could have more than 2 cases if applicable.

    return weights

And that covers most of it. For your specific case since this heavy calculation only relies on the shape of the input, if you expect to call this function repeatedly with similarly sized input you can cache all previously calculated weights:

def get_weights_for_shape(xsize, ysize, _cached_weights={}):
    if (xsize, ysize) not in _cached_weights:
        #assume we added an underscore to real function written above
        _cached_weights[xsize,ysize] = _get_weights_for_shape(xsize, ysize)
    return _cached_weights[xsize,ysize]

As far as I can tell this seems like the most optimized you are going to get. The only improvements would be to vectorize function (even if that means just calling it in several threads in parallel) or possibly if .flatten() makes an expensive copy that could be improved but I'm not totally sure how.

I cannot do this operation: `(xsize,ysize) = dims`. It gives me an error : not enough values to unpack (expected 2, got 1) — AA10, Dec 12 '19 at 00:39
you didn't show how you are getting `xsize` and `ysize` so I assumed it was by taking it directly from the `x.shape`, if your data is only 1 dimensional then is `ysize` = 1 ? — Tadhg McDonald-Jensen, Dec 12 '19 at 00:48
My explanation was poor. My vector x is always one dimensional. Its size will depend on xsize and ysize that are just two integers. If ysize = 2 and xsize =3 then the size of the one dimensional array will be ysize*xsize. The problem here is that for x[0] we have (i=0, j=0) x[1] (i=0, j=1). This is the reason why I have a nested loop. If I could get the `i` and `j` from c in a direct way, I wouldnt need the loop. — AA10, Dec 12 '19 at 00:53
as written you can just pass `F(x.reshape((xsize, ysize))` to get what you want. I want to pull attention to where I'm using `np.meshgrid` since that gives me a vectorized indices to use in calculations, although it works a lot better when your input matches the shape of the loops. — Tadhg McDonald-Jensen, Dec 12 '19 at 01:00
It works. It is slower than first code but I think it is scaling in a better way. For a certain size it will get faster than the first code I guess. But can you explain to me what do you mean by reuse? — AA10, Dec 12 '19 at 01:49
hey @AA10, I've completely restructured the answer after realizing there was more content than just "caching is good!". You were asking about vectorizing so I'm now demonstrating how to vectorize the parts that can be vectorized. I'm not sure how to measure the point where making a bunch of numpy arrays gets more efficient than a python loop. — Tadhg McDonald-Jensen, Dec 12 '19 at 02:40

Vectorizing nested loop with conditionals and functions

1 Answers1

Linked