How to deal with "divide by zero" with pandas dataframes when manipulating columns?

Question

I'm working with hundreds of pandas dataframes. A typical dataframe is as follows:

import pandas as pd
import numpy as np
data = 'filename.csv'
df = pd.read_csv(data)  # pd.DataFrame(data) would not read the file
df

        one       two     three   four   five
a  0.469112 -0.282863 -1.509059    bar   True
b  0.932424  1.224234  7.823421    bar  False
c -1.135632  1.212112 -0.173215    bar  False
d  0.232424  2.342112  0.982342  unbar   True
e  0.119209 -1.044236 -0.861849    bar   True
f -2.104569 -0.494929  1.071804    bar  False
....

In some operations I divide one column's values by another's, e.g.

df['one']/df['two']

However, sometimes the denominator is zero, or both the numerator and the denominator are zero:

df['one'] = 0
df['two'] = 0

Naturally, this outputs the error:

ZeroDivisionError: division by zero

I would prefer for 0/0 to actually mean "there's nothing here", as this is often what such a zero means in a dataframe.

(a) How would I code this so that the result of a "divide by zero" is 0?

(b) How would I code this to "pass" if a divide by zero is encountered?

I cannot write an answer because this has been marked as duplicate (?) but an option if you only have one column with zeros is to do this: `1 / (df.ColumnWithZeros / df.ColumnWithoutZeros)` which is mathematically equivalent. Just like 1 / (2 / 3) is equivalent to 3 / 2 — Connor, Jul 17 '19 at 05:23
Related: [Handling division by zero in Pandas calculations](https://stackoverflow.com/q/45540015/55075) — kenorb, Oct 18 '20 at 18:30

Alexander · Answer 1 · 2016-08-11T07:39:38.697

It would probably be more useful to use a dataframe that actually has zero in the denominator (see the last row of column two).

        one       two     three   four   five
a  0.469112 -0.282863 -1.509059    bar   True
b  0.932424  1.224234  7.823421    bar  False
c -1.135632  1.212112 -0.173215    bar  False
d  0.232424  2.342112  0.982342  unbar   True
e  0.119209 -1.044236 -0.861849    bar   True
f -2.104569  0.000000  1.071804    bar  False

>>> df.one / df.two
a   -1.658442
b    0.761639
c   -0.936904
d    0.099237
e   -0.114159
f        -inf  # <<< Note division by zero
dtype: float64

When the denominator is zero and the columns are floats, you should get inf or -inf in the result (or NaN for 0/0). One way to convert these values is as follows:

df['result'] = df.one.div(df.two)

df.loc[~np.isfinite(df['result']), 'result'] = np.nan  # Or = 0 per part a) of question.
# or df.loc[np.isinf(df['result']), ...

>>> df
        one       two     three   four   five    result
a  0.469112 -0.282863 -1.509059    bar   True -1.658442
b  0.932424  1.224234  7.823421    bar  False  0.761639
c -1.135632  1.212112 -0.173215    bar  False -0.936904
d  0.232424  2.342112  0.982342  unbar   True  0.099237
e  0.119209 -1.044236 -0.861849    bar   True -0.114159
f -2.104569  0.000000  1.071804    bar  False       NaN
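
For parts (a) and (b) of the question, here is a minimal sketch, assuming float columns named `one` and `two` as above. The small frame is made up for illustration. Division by zero yields `inf`, `-inf`, or `NaN` (for 0/0), and `replace` plus `fillna` can map all three to a chosen value.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row f divides by zero, row g is 0/0.
df = pd.DataFrame(
    {"one": [0.5, -2.0, 0.0], "two": [2.0, 0.0, 0.0]},
    index=["a", "f", "g"],
)

ratio = df["one"].div(df["two"])  # a: 0.25, f: -inf, g: NaN

# (b) "pass": turn +inf and -inf into NaN so the row counts as missing.
as_missing = ratio.replace([np.inf, -np.inf], np.nan)

# (a) treat every division by zero as 0.
as_zero = as_missing.fillna(0)
```

Note that `fillna(0)` also fills NaN values that were already missing in the inputs, so it may be safer to stop at `as_missing` when a genuine 0 and "no data" need to stay distinguishable.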

Thank you for the explanation; the 'NaN' imputation is quite useful. I will improve my questions in the future — ShanZhengYang, Aug 11 '16 at 05:31
I don't think this addresses the question, which is about ZeroDivisionError. I don't get "inf" when I divide by zero. I get ZeroDivisionError. — Drew Nutter, Jun 22 '20 at 17:04
Small clarification: as long as the datatype of the two columns is `float`, this will work. For int it will not. Hence, if you encounter the issue mentioned by Drew, just cast your columns to float before the division. — ira, Mar 30 '23 at 11:23
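
The two comments above can be reconciled with a small sketch. It assumes (the question does not confirm this) that the columns raising `ZeroDivisionError` have `object` dtype, for example because a CSV column contained stray text, so the division is carried out by Python element by element. As far as I know, NumPy integer dtypes still produce `inf`, so the cast matters mainly for object columns. The frame and its values are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose columns hold plain Python numbers as objects.
df = pd.DataFrame({"one": [1, 2.5, 0], "two": [4, 0, 0]}, dtype=object)

try:
    df["one"] / df["two"]
    raised = False
except ZeroDivisionError:
    # Expected on object dtype: each element is divided by Python itself.
    raised = True

# Casting to a numeric dtype restores the inf/NaN behaviour.
num = pd.to_numeric(df["one"], errors="coerce")
den = pd.to_numeric(df["two"], errors="coerce")
result = num / den  # 0.25, inf, NaN
```

Checking `df.dtypes` before dividing is a quick way to see whether this case applies.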

score 11 · Answer 2 · edited Nov 19 '18 at 02:23

df['one'].divide(df['two'])

Code:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.rand(5,2), columns=list('ab'))
df.loc[[1,3], 'b'] = 0
print(df)

print(df['a'].divide(df['b']))

Result:

    a           b
0   0.517925    0.305973
1   0.900899    0.000000
2   0.414219    0.781512
3   0.516072    0.000000
4   0.841636    0.166157

0    1.692717
1         inf
2    0.530023
3         inf
4    5.065297
dtype: float64

answered Aug 11 '16 at 03:19 by Kartik · edited Nov 19 '18 at 02:23 by user229044

This answer doesn't seem to answer the question. The solution is to use: `df['one'].div(df['two']).replace(np.inf, 0)`. – kenorb Oct 18 '20 at 13:50
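
Note that `replace(np.inf, 0)` leaves `-inf` and the `NaN` from 0/0 untouched. An alternative sketch, assuming the goal is a result of 0 wherever the denominator is zero, masks the zeros before dividing. The frame reuses the `a`/`b` column names from this answer with made-up values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [0.5, -0.9, 0.0, 1.0], "b": [0.25, 0.0, 0.0, 0.5]})

# Zeros in the denominator become NaN, so the quotient is NaN there
# instead of inf, -inf, or NaN depending on the numerator.
safe_b = df["b"].where(df["b"] != 0)

# Fill the masked rows with 0, per part (a) of the question.
df["ratio"] = df["a"].div(safe_b).fillna(0)
```

Dropping the final `fillna(0)` keeps those rows as NaN, which corresponds to part (b).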

score -1 · Answer 3 · answered Aug 11 '16 at 04:24

You can always use a try statement:

try:
    z = var1 / var2
except ZeroDivisionError:
    print("0")  # Python 3 requires parentheses for print

OR...

You can also do:

if var2 == 0:
    # Covers both x/0 and 0/0; checking only for 0/0 would still raise on x/0
    print("0")
else:
    var3 = var1 / var2

Hope this helped! Choose whichever choice you desire (they're both the same anyways).

answered Aug 11 '16 at 04:24 by Christian

`if ((var1==0) && (var2==0)):` – Victor Mar 24 '20 at 04:16

score -4 · Accepted Answer · answered Aug 11 '16 at 03:07

Two approaches to consider:

Prepare your data so that it never has a divide by zero situation, by explicitly coding a "no data" value and testing for that.

Wrap each division that might result in an error with a try/except pair, as described at https://wiki.python.org/moin/HandlingExceptions (which has a divide by zero example to use)

x, y = 5, 0
try:
    z = x / y
except ZeroDivisionError:
    print("divide by zero")  # print() call, so this also runs on Python 3

I worry about the situation where your data includes a zero that's really a zero (and not a missing value).
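
One way to address that concern is to separate the two cases explicitly. The sketch below assumes that 0/0 means "no data" (as the question suggests) while a nonzero value divided by zero is an anomaly worth keeping visible. The column names, the `div_by_zero` flag, and that policy are illustrative assumptions, not something the question specifies.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"one": [1.0, 0.0, 3.0], "two": [2.0, 0.0, 0.0]})

both_zero = (df["one"] == 0) & (df["two"] == 0)  # 0/0: treated as "no data"
only_den_zero = (df["two"] == 0) & ~both_zero    # x/0 with x != 0

df["ratio"] = df["one"].div(df["two"])           # 0.5, NaN, inf
df.loc[both_zero, "ratio"] = np.nan              # already NaN; stated explicitly
df["div_by_zero"] = only_den_zero                # flag rows for later review
```

Keeping the flag in its own column means a later `replace` or `fillna` on `ratio` does not erase the information about which rows were affected.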

answered Aug 11 '16 at 03:07 by vielmetti

Pandas (or NumPy) does not raise ZeroDivisionError. – ayhan Aug 11 '16 at 07:10
@ayhan I'm getting ZeroDivisionError from using the pandas `div` function. File "processing.py", line 50, in fun || df['pct'] = df['diffs', '2019-11-13'].divide(df['shares_latest']) || File "pandas/core/ops/__init__.py", line 570, in flex_wrapper || return self._binop(other, op, level=level, fill_value=fill_value) || File "pandas/core/series.py", line 2618, in _binop || result = func(this_vals, other_vals) || ZeroDivisionError: float division by zero – Drew Nutter Jun 22 '20 at 17:08
