I have read this question and understand that Numpy arrays cannot be used in boolean context. Let's say I want to perform an element-wise boolean check on the validity of inputs to a function. Can I realize this behavior while still using Numpy vectorization, and if so, how? (and if not, why?)

In the following example, I compute a value from two inputs while checking that both inputs are valid (neither may be negative):

import math
import numpy as np
def calculate(input_1, input_2):
    if input_1 < 0 or input_2 < 0:
        return 0
    return math.sqrt(input_1) + math.sqrt(input_2)
calculate_many = (lambda x: calculate(x, 20 - x))(np.arange(-20, 40))

By itself, this does not work with NumPy arrays: the if statement raises a ValueError because the truth value of an array is ambiguous. It is also imperative that math.sqrt is never run on negative inputs, because that would raise another error.

One solution using list comprehension is as follows:

calculate_many = [calculate(x, 20 - x) for x in np.arange(-20, 40)]

However, this no longer uses vectorization and would be painfully slow if the size of the arange was increased drastically. Is there a way to implement this if check while still using vectorization?

Dancing Bear
  • `math.sqrt` only works with scalars, so your `calculate`, even without the `if`, does not work with the whole array, i.e. there's no "vectorization". There is an `np.sqrt` that works with a whole array. It accepts a `where` parameter to control which values are evaluated (use it with the `out` parameter). – hpaulj Jun 29 '20 at 19:25
  • What does it mean for optional `where` to be an array-like? Does that mean I would write `np.sqrt(input_1, input_1 > 0)` if I wanted to square root all positive numbers in `input_1` and leave all other output entries as undefined? – Dancing Bear Jun 29 '20 at 19:48
  • ^the above does not work when run, but I don't know how to write it otherwise and documentation does not explain. – Dancing Bear Jun 29 '20 at 19:54
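
On the `where` question in the comments above: for NumPy ufuncs the second positional argument is `out`, so `np.sqrt(input_1, input_1 > 0)` does not pass a mask at all. A minimal sketch of the keyword form, assuming a small float array chosen only for illustration:

```python
import numpy as np

# Illustrative input, not taken from the question.
input_1 = np.array([4.0, -9.0, 16.0])

# `where` must be passed by keyword. Pairing it with `out` gives the
# skipped entries a defined value (0.0 here) instead of leaving them
# uninitialized.
result = np.sqrt(input_1, where=input_1 > 0, out=np.zeros_like(input_1))
print(result)  # [2. 0. 4.]
```

"Array-like" here means anything that broadcasts to a boolean array of the input's shape, such as the comparison `input_1 > 0`.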

2 Answers

I believe the expression below performs vectorized operations and avoids the use of loops and lambda functions:

np.sqrt(((input1>0) & 1)*input1) + np.sqrt(((input2>0) & 1)*input2)
sam
  • I'm a student unfamiliar with industry. Is use of lambdas like the way I used them generally considered bad practice then? – Dancing Bear Jun 29 '20 at 19:51
  • Not a bad practice, I would say. But it is not a vectorized operation either. When you don't have long arrays, or in other non-numpy cases, this should not be a bad thing. In fact it's pythonic! – sam Jun 29 '20 at 20:24
In [121]: x = np.array([1, 10, 21, -1.])                                                
In [122]: y = 20-x                                                                      
In [123]: np.sqrt(x)                                                                    
/usr/local/bin/ipython3:1: RuntimeWarning: invalid value encountered in sqrt
  #!/usr/bin/python3
Out[123]: array([1.        , 3.16227766, 4.58257569,        nan])

There are several ways of dealing with 'out-of-range' values.

@Sam's approach is to tweak the inputs so they are valid:

In [129]: ((x>0) & 1)*x                                                                 
Out[129]: array([ 1., 10., 21., -0.])

Another is to use masking to limit which values are calculated.

Your function skips the sqrt if either input is negative; conversely, it does the calculation only where both are valid. That's different from testing each input separately.
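
To make that difference concrete, here is a small sketch that reuses the same `x` and `y` and compares the per-input test with the joint test. The names `separate`, `both_ok` and `joint` are illustrative only.

```python
import numpy as np

x = np.array([1, 10, 21, -1.])
y = 20 - x  # [19., 10., -1., 21.]

# Testing each input separately: an invalid entry contributes 0,
# but the other input's square root is still added.
separate = np.sqrt(((x > 0) & 1) * x) + np.sqrt(((y > 0) & 1) * y)

# Testing both together, as the question's `calculate` does:
# the whole element is 0 if either input is negative.
both_ok = (x >= 0) & (y >= 0)
joint = np.where(both_ok, separate, 0.0)

print(separate)
print(joint)
```

The last two elements differ: the per-input version keeps sqrt(21), while the joint version returns 0 there.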

In [124]: mask = (x>=0) & (y>=0)                                                        
In [125]: mask                                                                          
Out[125]: array([ True,  True, False, False])

We can use the mask thus:

In [126]: res = np.zeros_like(x)                                                        
In [127]: res[mask] = np.sqrt(x[mask]) + np.sqrt(y[mask])                               
In [128]: res                                                                           
Out[128]: array([5.35889894, 6.32455532, 0.        , 0.        ])

In my comments I suggested using the where parameter of np.sqrt. It does, though, need an out parameter as well.

In [130]: (np.sqrt(x, where=mask, out=np.zeros_like(x)) +
     ...:  np.sqrt(y, where=mask, out=np.zeros_like(x)))
Out[130]: array([5.35889894, 6.32455532, 0.        , 0.        ])
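
As a sketch of how this could be packaged for the question's `np.arange(-20, 40)` case, the function below combines the joint mask with `where`/`out` and checks the result against the scalar loop. The name `calculate_vectorized` is hypothetical, and the sketch assumes both inputs have the same shape.

```python
import math
import numpy as np

def calculate(input_1, input_2):
    # Scalar reference version from the question.
    if input_1 < 0 or input_2 < 0:
        return 0
    return math.sqrt(input_1) + math.sqrt(input_2)

def calculate_vectorized(input_1, input_2):
    # Hypothetical helper: joint mask plus where=/out= on np.sqrt.
    input_1 = np.asarray(input_1, dtype=float)
    input_2 = np.asarray(input_2, dtype=float)
    mask = (input_1 >= 0) & (input_2 >= 0)
    # Entries where mask is False keep the 0.0 supplied through `out`.
    root_1 = np.sqrt(input_1, where=mask, out=np.zeros_like(input_1))
    root_2 = np.sqrt(input_2, where=mask, out=np.zeros_like(input_2))
    return root_1 + root_2

x = np.arange(-20, 40)
vectorized = calculate_vectorized(x, 20 - x)
looped = np.array([calculate(v, 20 - v) for v in x])
print(np.allclose(vectorized, looped))  # True
```

Converting to float first matters because `np.arange(-20, 40)` is an integer array, and `out` should have a floating-point dtype to hold the square roots.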

Alternatively, if we are happy with the nan in Out[123], we can just suppress the RuntimeWarning.
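
One way to do that, sketched here with `np.errstate` (the final `np.where` step is an optional extra, not something the question requires):

```python
import numpy as np

x = np.array([1, 10, 21, -1.])
y = 20 - x

# np.errstate silences the "invalid value" warning only inside the block.
with np.errstate(invalid="ignore"):
    total = np.sqrt(x) + np.sqrt(y)  # nan wherever either input is negative

# Optional: turn the nan entries into 0 to match the question's function.
# Note that this would also zero out any nan already present in the inputs.
res = np.where(np.isnan(total), 0.0, total)
print(res)
```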

hpaulj