I am running the following code:

df['diff']=np.where(df['fc']!=0,(df['ar']-df['fc'])/df['fc'],0)

It is returning a ZeroDivisionError. I'm not sure how that is happening when I am specifying to only run that formula if the denominator does not equal 0. If I run the below it works, but I don't want to cut out the rows where this is true:

df2=df[df['fc']!=0]
df2['diff']=(df['ar']-df['fc'])/df['fc']

edit to "specify clear goal": The desired output here would return 0 where the denominator is 0 and return the difference % where the denominator > 0.

Thank you for the answer! Works perfectly.

sokeefe1014
  • `(df['ar']-df['fc'])/df['fc']` performs the division for all elements, zero or not. – user2357112 Sep 12 '17 at 21:34
  • hmm.. I thought np.where worked as an if-then statement, so that it only performed this if the qualifier were true? If not, then what is the point of the np.where statement? It performs this, but only keeps the value if the qualifier is true? – sokeefe1014 Sep 12 '17 at 21:36
  • "I'm not sure how that is happening when I am specifying to only run that formula if the denominator does not equal 0" No, that isn't what you are doing. `np.where` doesn't magically change the way Python works, as all expressions are evaluated to their value before being passed as an argument. `np.where` returns the values *where the boolean array that you pass it is truthy*. You don't pass it an expression, you *pass it a value*. – juanpa.arrivillaga Sep 12 '17 at 21:37
  • @sokeefe1014, I can't reproduce that using your `np.where(...)` code. Can you provide a small, reproducible sample data set? – MaxU - stand with Ukraine Sep 12 '17 at 21:37
  • got it. Can one of you suggest a more appropriate way to handle this? – sokeefe1014 Sep 12 '17 at 21:39
  • Here are two useful questions: [one](https://stackoverflow.com/questions/25087769/runtimewarning-divide-by-zero-error-how-to-avoid-python-numpy), [two](https://stackoverflow.com/questions/26248654/numpy-return-0-with-divide-by-zero). If I am not mistaken, though, recent versions don't raise an error; they issue a warning and return `nan` or `inf`. – ayhan Sep 12 '17 at 21:39
  • "It performs this, but only keeps the value if the qualifier is true?" - exactly. `numpy.where` is not conditional execution; it is conditional *selection*, selecting results from arrays that already exist, based on a condition array. – user2357112 Sep 12 '17 at 21:39
  • As MaxU & ayhan note, your code ought to work by returning NaN values instead of an error. Not sure if it could be the data or an old version of pandas. This might be a workaround: `df['diff']=(df2['ar']-df2['fc'])/df2['fc']` (It's actually your code, I just reversed df & df2. It leaves NaN, but you can then use fillna(0) to fix that.) – JohnE Sep 12 '17 at 22:15

1 Answer

You have a small misunderstanding of how the np.where(condition, [x, y]) function works. When you make the call, the arguments x and y are evaluated first. In your case, x=(df['ar']-df['fc'])/df['fc'], so the division runs on every row and you will encounter the ZeroDivisionError if df['fc'] contains any zero. As @user2357112 put it in the comments, np.where performs conditional selection: for each element, it takes the value from x where the condition is True and from y where it is False.
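To make the evaluation order concrete, here is a minimal sketch on made-up data (the values in df are illustrative and not from the question). It splits the one-liner into two steps so the intermediate array is visible. With float columns like these, the division usually produces inf rather than raising; other data types may behave differently.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption, not the asker's real frame)
df = pd.DataFrame({"ar": [12.0, 5.0, 9.0], "fc": [10.0, 0.0, 3.0]})

# Step 1: the division is computed for every row, including the zero denominator.
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = (df["ar"] - df["fc"]) / df["fc"]

# Step 2: np.where only chooses between the already-computed ratio and 0.
df["diff"] = np.where(df["fc"] != 0, ratio, 0)
```

By the time np.where sees ratio, the bad division has already happened; the condition only decides which precomputed value to keep.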

If you want to keep all rows, including those where df['fc'] is zero, you can first set those elements to np.nan. After the computation, you can handle the resulting NaN values, e.g., set them to zero.

Here is the pseudo code:

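# Note: this overwrites the zeros in df['fc'] with NaN in place.
df.loc[df.fc == 0, 'fc'] = np.nan
# Dividing by NaN yields NaN; fillna(0) turns those results into 0.
df['diff'] = ((df.ar-df.fc)/df.fc).fillna(0)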

And here is the test of dividing np.nan:

In [1]: a = np.array([1.0, np.nan])

In [2]: 2/a
Out[2]: array([  2.,  nan])
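If overwriting df['fc'] is not acceptable, one alternative (not part of the approach above) is NumPy's np.divide with its out and where arguments, which performs the division only where the mask is True. The sketch below assumes numeric columns and uses made-up data; num and den are hypothetical helper names.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption, not the asker's real frame)
df = pd.DataFrame({"ar": [12.0, 5.0, 9.0], "fc": [10.0, 0.0, 3.0]})

num = (df["ar"] - df["fc"]).to_numpy(dtype=float)
den = df["fc"].to_numpy(dtype=float)

# Divide only where the denominator is nonzero; the remaining slots keep
# the 0.0 they were initialised with in `out`.
df["diff"] = np.divide(num, den, out=np.zeros_like(num), where=den != 0)
```

This leaves df['fc'] untouched, and unlike fillna(0) it does not turn NaN values that were already present in the inputs into 0.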

Thanks.

rojeeer