13

I am running into RuntimeWarning: invalid value encountered in divide

 import numpy
 a = numpy.random.rand(1000000, 100)
 b = numpy.random.rand(1, 100)
 dots = numpy.dot(b, a.T) / numpy.dot(b, b.T)
 norms = numpy.linalg.norm(a, axis=1)
 angles = dots / norms  # basically, I am calculating the angle between 2 vectors

Some vectors in my a have a norm of 0, so calculating angles gives the runtime warning.

Is there a one-line Pythonic way to compute angles while taking into account the norms that are 0?

angles =[i/j if j!=0 else -2 for i,j in zip(dots, norms)] # takes 10.6 seconds

But this takes around 10.6 seconds, which is insane. Since all angles lie between -1 and 1 and I only need the 10 largest values, using -2 as a placeholder for the undefined ones works for me.

pg2455
  • can you filter norms before passing it? – Padraic Cunningham Aug 01 '14 at 20:06
  • No, I want to preserve the indices, because in the end I want the indices of the 10 largest angles, which are then matched against another list. So preserving indices is a must for me. – pg2455 Aug 01 '14 at 20:09
  • With the code you posted, the `angles` variable will have NaNs where `norms` is zero. Isn't that what you want? Note that if you want `None` instead of `NaN`, then you'll need to make `angles` of dtype `object`, which is bad for performance... (Also, it is possible to just suppress RuntimeWarnings. Is that what you want?) – unutbu Aug 01 '14 at 20:16
  • I want specific value at place where NaN is encountered. So that I can manipulate it while sorting. – pg2455 Aug 01 '14 at 20:24
  • NaNs sort higher than anything else, so after sorting you can just strip the top `np.count_nonzero(np.isnan(d.ravel()))` values – jtaylor Aug 01 '14 at 20:28
  • How is your code sorting it? – pg2455 Aug 01 '14 at 20:32
  • "one line pythonic way": write a little function, e.g. `div0` in [numpy-return-0-with-divide-by-zero/](http://stackoverflow.com/questions/26248654/numpy-return-0-with-divide-by-zero/35696047#35696047) – denis Jul 06 '16 at 16:21

5 Answers

17

You can ignore the warnings with the np.errstate context manager and later replace the non-finite values with whatever you want:

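import numpy as np
angle = np.arange(-5., 5.)
norm = np.arange(10.)
# 'divide' covers x/0 (gives inf); 'invalid' covers 0/0 (gives nan)
with np.errstate(divide='ignore', invalid='ignore'):
    print(np.where(norm != 0., angle / norm, -2))
# or:
with np.errstate(divide='ignore', invalid='ignore'):
    res = angle / norm
res[~np.isfinite(res)] = -2  # catches inf as well as nan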
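
In the question, a zero-norm row also has a zero dot product, so the division there is 0/0, which NumPy reports under the 'invalid' category rather than 'divide'. A minimal sketch of the difference, assuming a hypothetical helper `fp_error` (not part of NumPy) that turns the warning into an exception so its message can be inspected:

```python
import numpy as np

one = np.array([1.0])
zero = np.array([0.0])

def fp_error(numerator, denominator):
    """Return the floating-point error message raised by the division, or None."""
    # all='raise' turns every floating-point warning into a FloatingPointError
    with np.errstate(all='raise'):
        try:
            numerator / denominator
        except FloatingPointError as exc:
            return str(exc)
    return None

nonzero_by_zero = fp_error(one, zero)  # x/0 falls under the 'divide' category
zero_by_zero = fp_error(zero, zero)    # 0/0 falls under the 'invalid' category
print(nonzero_by_zero)
print(zero_by_zero)
```

So to silence the warning from the question, `invalid='ignore'` is the setting that matters; `divide='ignore'` only helps when a nonzero value is divided by zero.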
jtaylor
5

In newer versions of NumPy there is a third option that avoids the errstate context manager.

All NumPy ufuncs accept an optional "where" argument. It behaves differently from the np.where function: the ufunc is evaluated only where the mask is True. Where the mask is False the output is left untouched, so passing a preallocated array as the "out" argument lets us set any default we want.

import numpy as np

angle = np.arange(-5., 5.)
norm = np.arange(10.)

# version 1
with np.errstate(divide='ignore'):
    res1 = np.where(norm != 0., angle / norm, -2)

# version 2
with np.errstate(divide='ignore'):
    res2 = angle/norm
res2[np.isinf(res2)] = -2

# version 3
res3 = -2. * np.ones(angle.shape)
np.divide(angle, norm, out=res3, where=norm != 0)

print(res1)
print(res2)
print(res3)

np.testing.assert_array_almost_equal(res1, res2)
np.testing.assert_array_almost_equal(res1, res3)
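
Applied to the arrays from the question, the same idea might look like the sketch below. The small `dots` and `norms` arrays are made-up stand-ins that only mirror the shapes in the question: (1, N) for dots and (N,) for norms.

```python
import numpy as np

# Stand-ins for the question's arrays: dots has shape (1, N), norms has shape (N,)
dots = np.array([[0.0, 3.0, -2.0, 0.0]])
norms = np.array([0.0, 6.0, 4.0, 0.0])

# Pre-fill the output with the sentinel; positions where the mask is False keep it
angles = np.full_like(dots, -2.0)
np.divide(dots, norms, out=angles, where=norms != 0)

print(angles)
```

The mask is broadcast against the output just like the operands, so the 1-D `norms != 0` works with the 2-D `dots`. No division is performed at the masked positions, so no warning is emitted.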
DStauffman
3

You could use angles[~np.isfinite(angles)] = ... to replace nan values with some other value.

For example:

In [103]: angles = dots/norms

In [104]: angles
Out[104]: array([[ nan,  nan,  nan, ...,  nan,  nan,  nan]])

In [105]: angles[~np.isfinite(angles)] = -2

In [106]: angles
Out[106]: array([[-2., -2., -2., ..., -2., -2., -2.]])

Note that division by zero may result in infs, rather than nans,

In [140]: np.array([1, 2, 3, 4, 0])/np.array([1, 2, 0, -0., 0])
Out[140]: array([  1.,   1.,  inf, -inf,  nan])

so it is better to call np.isfinite rather than np.isnan to identify the places where there was division by zero.

In [141]: np.isfinite(np.array([1, 2, 3, 4, 0])/np.array([1, 2, 0, -0., 0]))
Out[141]: array([ True,  True, False, False, False], dtype=bool)

Note that if you only want the top ten values from a NumPy array, using the np.argpartition function may be quicker than fully sorting the entire array, especially for large arrays:

In [110]: N = 3

In [111]: x = np.array([50, 40, 30, 20, 10, 0, 100, 90, 80, 70, 60])

In [112]: idx = np.argpartition(-x, N)

In [113]: idx
Out[113]: array([ 6,  7,  8,  9, 10,  0,  1,  4,  3,  2,  5])

In [114]: x[idx[:N]]
Out[114]: array([100,  90,  80])

This shows np.argpartition is quicker even for only moderately large arrays:

In [123]: x = np.array([50, 40, 30, 20, 10, 0, 100, 90, 80, 70, 60]*1000)

In [124]: %timeit np.sort(x)[-N:]
1000 loops, best of 3: 233 µs per loop

In [125]: %timeit idx = np.argpartition(-x, N); x[idx[:N]]
10000 loops, best of 3: 53.3 µs per loop
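
Putting the two pieces together for the question's use case (the indices of the 10 largest angles, with undefined entries pushed to the bottom) could look like this sketch. The random `angles` array and the NaN positions are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
angles = rng.uniform(-1.0, 1.0, size=1000)
angles[::50] = np.nan  # pretend these came from zero norms

# Sentinel below any valid value, so these entries never reach the top
angles[~np.isfinite(angles)] = -2

k = 10
top = np.argpartition(-angles, k)[:k]  # indices of the k largest, in no particular order
top = top[np.argsort(-angles[top])]    # order those k from largest to smallest

print(top)
print(angles[top])
```

Because `top` holds positions in the original array, it can be used directly to look up the matching entries in another list.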
unutbu
2

You want to be using np.where. See the documentation.

angles = np.where(norms != 0, dots/norms, -2)

angles will consist of dots/norms wherever norms != 0, and will be -2 otherwise. You will still get the RuntimeWarning, because the entire vector dots/norms is computed before np.where selects from it, but you can safely ignore it.
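
A small sketch that makes this visible by recording warnings with the standard `warnings` module. The two-element arrays are made up for illustration, and the second half assumes the np.errstate approach mentioned elsewhere in this thread.

```python
import warnings
import numpy as np

dots = np.array([0.0, 3.0])
norms = np.array([0.0, 6.0])

# Plain np.where: the division runs on every element, so 0/0 still warns
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    angles = np.where(norms != 0, dots / norms, -2)

# Same expression inside np.errstate: same result, no warning recorded
with warnings.catch_warnings(record=True) as caught_quiet:
    warnings.simplefilter("always")
    with np.errstate(divide='ignore', invalid='ignore'):
        angles_quiet = np.where(norms != 0, dots / norms, -2)

print(len(caught), len(caught_quiet))
print(angles)
```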

Roger Fan
  • Somehow it didn't give me any warning!! – pg2455 Aug 01 '14 at 20:41
  • 1
    If you're running python interactively (e.g. in ipython) then I believe it will only give you the warning once per session. You can also always add `np.errstate(divide='ignore')` as jtaylor suggested to suppress the warning in the future. – Roger Fan Aug 01 '14 at 20:44
0

You can use np.where( condition ) to perform a conditional slice of where norms does not equal 0 before dividing:

nonzero = np.where(norms != 0)[0]  # indices where the norm is nonzero
angles = dots[..., nonzero] / norms[nonzero]
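
Note that single-argument np.where(condition) returns a tuple of index arrays, not the filtered values, so the indices have to be used to select from both arrays. A minimal sketch with made-up 1-D arrays (the names `kept` and `best_original_index` are illustrative) that also shows how to map a filtered position back to its original index:

```python
import numpy as np

dots = np.array([0.0, 3.0, -2.0])
norms = np.array([0.0, 6.0, 4.0])

kept = np.where(norms != 0)[0]     # original positions of the nonzero norms
angles = dots[kept] / norms[kept]  # shorter than the input: zero-norm entries are dropped

# Map a position in the filtered result back to the original index
best_original_index = kept[np.argmax(angles)]
print(kept, angles, best_original_index)
```

Filtering changes the length of the result, so keeping `kept` around is what preserves the link to the original indices.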
agconti
  • norms is a vector, not a number. I want to do element-wise division of vectors, putting None where division is not possible – pg2455 Aug 01 '14 at 20:04