How to avoid division by zero in a 2D NumPy array when taking the average?

Question

Let's say I have three arrays

A = np.array([[2,2,2],[1,0,0],[1,2,1]])
B = np.array([[2,0,2],[0,1,0],[1,2,1]])
C = np.array([[2,0,1],[0,1,0],[1,1,2]])
A,B,C
(array([[2, 2, 2],
        [1, 0, 0],
        [1, 2, 1]]),
 array([[2, 0, 2],
        [0, 1, 0],
        [1, 2, 1]]),
 array([[2, 0, 1],
        [0, 1, 0],
        [1, 1, 2]]))

When I take the average of C/(A+B), I get nan/inf values along with a RuntimeWarning. The resulting array looks like this:

np.average(C/(A+B), axis = 1)
array([0.25      ,        nan, 0.58333333])

I would like to change any inf/nan value to 0.

What I tried so far was

# doesn't work (maybe I'm doing this wrong)
mask = A+B >0
np.average(C[mask]/(A[mask]+B[mask]), axis = 1)


# does not work, and is not an ideal solution anyway
avg = np.average(C/(A+B), axis = 1)
avg[avg == np.nan] =0

Any help would be appreciated!

This should help: https://stackoverflow.com/questions/29950557/ignore-divide-by-0-warning-in-numpy — Nick, Jun 14 '22 at 06:40

score 1 · Answer 1 · answered Jun 14 '22 at 06:47

import numpy as np

a = np.array([1, np.nan])
print(a)  # [ 1. nan]
a = np.nan_to_num(a)  # nan is replaced by 0.0
print(a)  # [1. 0.]

https://numpy.org/doc/stable/reference/generated/numpy.nan_to_num.html

For inf and -inf:

from numpy import inf
avg[avg == inf] = 0
avg[avg == -inf] = 0
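
As a side note, np.nan_to_num by default replaces +inf and -inf with very large finite numbers rather than 0. A minimal sketch, assuming a NumPy version that supports the nan, posinf and neginf keyword arguments (1.17 or later), which maps all three cases to 0 in one call. The sample values in avg are made up for illustration:

```python
import numpy as np

# Illustrative values only: one finite entry, one nan, and both infinities.
avg = np.array([0.25, np.nan, np.inf, -np.inf])

# Replace nan, +inf and -inf with 0.0 in a single call.
cleaned = np.nan_to_num(avg, nan=0.0, posinf=0.0, neginf=0.0)
print(cleaned.tolist())  # [0.25, 0.0, 0.0, 0.0]
```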

User


How do we avoid the error and the division by 0 in the first place (maybe by using a mask?) rather than changing nan or inf/-inf to 0? – Jun 14 '22 at 06:52
Those are warnings, not errors. Just read the thread posted by @Nick on how to disable them. – User Jun 14 '22 at 06:56
Yes, disabling the warnings and replacing the values with 0 helps, but is there a way to do this using a mask or some other way around? I'm scratching my head trying to do this without disabling warnings and replacing with 0. – Jun 14 '22 at 07:04
You have `0`s in your input arrays, so the only option is to remove them before applying the `average`, but it's your input and you can't change it... You can try to implement the division of the arrays and the average yourself, with some `if` to avoid division by `0`, but I can't see any benefit in it. Just disable the warnings. – User Jun 14 '22 at 07:18
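
One way to disable the warnings only where they are expected is NumPy's np.errstate context manager. This is a sketch under that assumption, using the arrays from the question. It silences the warnings but does not change the result, so the nan is still there afterwards:

```python
import numpy as np

A = np.array([[2, 2, 2], [1, 0, 0], [1, 2, 1]])
B = np.array([[2, 0, 2], [0, 1, 0], [1, 2, 1]])
C = np.array([[2, 0, 1], [0, 1, 0], [1, 1, 2]])

# Warnings for x/0 ("divide") and 0/0 ("invalid") are suppressed
# inside this block only; the global settings are left untouched.
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = C / (A + B)

avg = np.average(ratio, axis=1)
print(avg)  # the second entry is still nan and has to be handled separately
```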

score 1 · Accepted Answer · answered Jun 14 '22 at 08:09

Both approaches you tried are valid ways of dealing with this, but each needs a slight change.

Avoiding the division upfront, by only calculating the result where it's valid (e.g. where the divisor is non-zero):

Indexing with the boolean mask you defined turns the resulting arrays into 1D arrays. Using it therefore means you have to allocate the result array upfront and assign into it with that same mask.

mask = A+B > 0
result = np.zeros_like(A, dtype=np.float32)
result[mask] = C[mask]/(A[mask]+B[mask])
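
The flattening described above can be checked directly. A minimal sketch using the arrays from the question; it only inspects shapes and is an illustration, not part of the original answer:

```python
import numpy as np

A = np.array([[2, 2, 2], [1, 0, 0], [1, 2, 1]])
B = np.array([[2, 0, 2], [0, 1, 0], [1, 2, 1]])
C = np.array([[2, 0, 1], [0, 1, 0], [1, 1, 2]])

mask = A + B > 0

# Boolean indexing returns only the selected elements as a flat 1D array,
# so there is no second axis left for np.average(..., axis=1) to reduce.
print(C.shape)        # (3, 3)
print(C[mask].shape)  # (8,) because one entry of A+B is zero and is dropped

# Assigning through the same mask puts the 1D values back into a 2D result;
# the position with a zero divisor keeps its initial value of 0.
result = np.zeros(A.shape, dtype=float)
result[mask] = C[mask] / (A[mask] + B[mask])
print(result.shape)   # (3, 3)
```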

It does require the averaging over the second dimension to be done separately, and rows where the division could not be done because of a zero divisor still have to be set to zero afterwards.

result = result.mean(axis=1)
result[(~mask).any(axis=1)] = 0

To me the main benefit would be avoiding the warning from NumPy, and perhaps, with a large number of zeros in A+B, you could gain a little performance by skipping that calculation altogether. But overall it seems like a lot of effort to me.

Masking invalid values afterwards:

The main takeaway here is that you should never compare against np.nan directly, since the comparison is always False. You can check this yourself by looking at the result of np.nan == np.nan. The way to handle this is to use the dedicated np.isnan function, or alternatively to negate np.isfinite if you also want to catch +/- np.inf values at the same time.

avg = np.average(C/(A+B), axis = 1)
avg[np.isnan(avg)] = 0

# or to include inf
avg[~np.isfinite(avg)] = 0
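
The difference between the comparison and the dedicated functions can be seen on a small array. The values in avg below are made up for illustration:

```python
import numpy as np

# Illustrative values only: one finite entry, one nan, one inf.
avg = np.array([0.25, np.nan, np.inf])

print(np.nan == np.nan)             # False
print((avg == np.nan).tolist())     # [False, False, False]  never matches
print(np.isnan(avg).tolist())       # [False, True, False]
print((~np.isfinite(avg)).tolist()) # [False, True, True]    nan and inf
```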

Michael Szczesny · Answer 3 · 2022-06-14T08:55:42.290

This is tougher than I thought, as np.mean's where argument doesn't work if it results in empty slices, and np.average's weights have to be 1-D when their shape differs from the array's.

# these don't work
# >>> np.mean(div, axis=1, where=mask.all(1, keepdims=True))
# RuntimeWarning: Mean of empty slice.
# RuntimeWarning: invalid value encountered in true_divide

# >>> np.average(div, axis=1, weights=mask.all(1, keepdims=True))
# TypeError: 1D weights expected when shapes of a and weights differ.

import numpy as np

A = np.array([[2,2,2],[1,0,0],[1,2,1]])
B = np.array([[2,0,2],[0,1,0],[1,2,1]])
C = np.array([[2,0,1],[0,1,0],[1,1,2]])

div = np.zeros(C.shape)
AB = A+B                                # avoid repeated summing
mask = AB > 0                           # use AB != 0 instead to include all valid (also negative) divisors
np.divide(C, AB, where=mask, out=div)   # with out=None the unselected elements would stay uninitialized
np.mean(div * mask.all(1, keepdims=True), axis = 1)

Output

array([0.25      , 0.        , 0.58333333])
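
The steps above could be wrapped into a small helper. This is only a sketch: the function name row_average_or_zero is hypothetical, and it uses AB != 0 so that negative divisors count as valid, as the comment in the code above suggests:

```python
import numpy as np

def row_average_or_zero(C, A, B):
    """Row-wise mean of C / (A + B); rows containing a zero divisor give 0.

    Hypothetical helper name, written for illustration only.
    """
    AB = A + B
    valid = AB != 0                           # True where division is defined
    div = np.zeros(C.shape, dtype=float)      # pre-filled output buffer
    np.divide(C, AB, where=valid, out=div)    # divide only at valid positions
    row_ok = valid.all(axis=1)                # rows without any zero divisor
    return np.where(row_ok, div.mean(axis=1), 0.0)

A = np.array([[2, 2, 2], [1, 0, 0], [1, 2, 1]])
B = np.array([[2, 0, 2], [0, 1, 0], [1, 2, 1]])
C = np.array([[2, 0, 1], [0, 1, 0], [1, 1, 2]])

print(row_average_or_zero(C, A, B))
```

No division by zero is ever evaluated here, so no warning should be raised and nothing has to be replaced afterwards.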

score 0 · Answer 4 · answered Jun 14 '22 at 08:14

Simply use this if the results that would otherwise be inf should be kept at zero:

np.divide(a, b, where=b.astype(bool))

luoshao23


this use of `where` also requires an `out` parameter. – hpaulj Jun 14 '22 at 11:47
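
A minimal sketch of the point made in the comment above, with made-up arrays a and b. Without out, the positions excluded by where are left uninitialized; passing a zero-filled out array makes them well defined:

```python
import numpy as np

# Illustrative inputs only; b contains one zero divisor.
a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 0.0, 4.0])

# Pre-fill the output with zeros; np.divide writes only where b != 0,
# so the entry for the zero divisor keeps its initial 0.0.
out = np.zeros_like(a)
np.divide(a, b, where=b != 0, out=out)
print(out.tolist())  # [0.5, 0.0, 0.75]
```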
