Catching ZeroDivisionError Yet Skips All Iterations in a Dataframe Column

Question

In order to avoid dividing by zero, the following logic is aimed to calculate a percentage change and skips the values where the prior month has 0 in values. However, the following would yield all percentage change to null even though there are valid non-NaN non-zero numbers in the referred columns.

df_ex['metric_diff'] = (df_ex['metric_lastmonth'] - df_ex['metric_thismonth'])
try:
    df_ex['metric_pctdiff'] = (df_ex['metric_lastmonth'] - df_ex['metric_thismonth'])/df_ex['metric_thismonth']
except ZeroDivisionError:
    df_ex['metric_pctdiff'] = np.nan

print(len(df_ex[df_ex['metric_diff'].notna()]))

521

print(len(df_ex[df_ex['metric_pctdiff'].notna()]))

0

The outputs indicate that there are nominal difference with non-NaN values, yet the percentage difference comparison yields all NaN values when compared the same two columns. Is there a logic error I made? What went wrong here?

Additional details: here is a sample of the Dataframe.

and in this example, there are two rows where we have non-null values from last month.

and... the desired outputs should be like in the following:

I interpret your results the following way: (1) `df_ex['metric_diff']` contains roughly 500 items that are **not** a `NaN`. (2) `df_ex['metric_pctdiff']` contains **only** `NaN`s. Is that correct? If so, then it's a bit strange, indeed. Could you add a small sample, a so-called MRE, to the question that reproduces those results (see [here](https://stackoverflow.com/help/minimal-reproducible-example) and [here](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples))? — Timus, May 31 '23 at 11:06
`try`-`except` doesn't work row-wise, it's an either-or globally. Ie. if the `try`-part raises one `ZeroDivisionError`, then **all** the items in `df_ex['metric_pctdiff']` are set to `np.nan`. — Timus, May 31 '23 at 11:08
Thank you Timus. I did not know that try - except is either-or globally. A MRE looks like the following: df_ex[0:10][['Identifier', 'metric_thismonth', 'metric_lastmonth']] — Pythonlearner, Jun 01 '23 at 13:46
Identifier metric_thismonth metric_lastmonth ID_5AE9 NaN NaN ID_5AG4 NaN NaN ID_5AJ8 NaN NaN ID_JAA6 NaN NaN ID_XAD6 0 0 ID_VAC4 NaN NaN ID_VAE0 NaN NaN ID_VAG5 NaN NaN ID_VAJ9 NaN NaN ID_5AX9 4.21232 11.591 — Pythonlearner, Jun 01 '23 at 13:47
Please edit samples into the question, don't provide them in comments. Imho your sample acutally doesn't reproduce your problem: I get `1` for `len(df_ex[df_ex['metric_pctdiff'].notna()])`, which is what I would expect? — Timus, Jun 01 '23 at 14:06
Timus I ended up finding a solution from an earlier post https://stackoverflow.com/questions/69489664/zero-division-in-one-row-causing-error-in-all-other-rows-pandas/76383448#76383448. For some reason, using fillana(o) resolves the issue I was having — Pythonlearner, Jun 01 '23 at 15:56

Saxtheowl · Answer 1 · 2023-06-01T16:48:42.610

Pandas will return inf when you divide by zero, and NaN when you divide zero by zero, that's why your ZeroDivisionError is not being caught and you got a return value NaN.

We could handle the case you want manually with the code below, it will calculate the percentage difference for non-zero 'metric_thismonth' and place NaN for zero 'metric_thismonth'

edit: lets try with a lambda function inside the apply() function

import pandas as pd
import numpy as np

df_ex['metric_diff'] = df_ex['metric_lastmonth'] - df_ex['metric_thismonth']

df_ex['metric_pctdiff'] = df_ex.apply(lambda row: (row['metric_diff'] / row['metric_lastmonth']) 
                                       if row['metric_lastmonth'] != 0 
                                       else np.nan, axis=1)

Thanks Saxtheowl. I tried the following with the requirement of calculation on non-null items but it still gave me an ZeroDivisionError: float division by zero error. df_ex['metric_pctdiff'] = np.where((df_ex['metric_lastmonth'].notnull() & df_ex['metric_diff'] != 0), df_ex['DLQ_60+_diff']/df_ex['metric_lastmonth'], np.nan) — Pythonlearner, Jun 01 '23 at 13:56

score 0 · Answer 2 · edited Jun 01 '23 at 22:26

0

Using .fillna(0) on the both columns resolves the problem (as described on the solution to another question):

df_ex['metric_pctdiff'] = df_ex['metric_diff'].fillna(0) / df_ex['metric_lastmonth'].fillna(0)

edited Jun 01 '23 at 22:26

Kyle F Hartzenberg

2,567
3
6
24

answered Jun 01 '23 at 15:55

Pythonlearner

11
3

Catching ZeroDivisionError Yet Skips All Iterations in a Dataframe Column

521

0

2 Answers2

Linked