I want to forward-fill the NaN values in a column so that each filled value is 1 more than the value above it, counting up from the previous non-NaN value as you go down the column.

Here is some sample code to run and compare against:

from pandas import DataFrame
from numpy import nan

# This is the dataframe I'm looking for in the end
desired_df = DataFrame.from_dict({'col':[nan, nan, 0, 1, 2, 3, 4, 0, 1, 2, 0, 0, 1, 2, 3, 4, 5, 6, 0, 1]})

# This is the input dataframe
df = DataFrame.from_dict({'col':[nan, nan, 0, nan, nan, nan, nan, 0, nan, nan, 0, 0, nan, nan, nan, nan, nan, nan, 0, nan]})

####### Turn "df" into "desired_df" here #######

# Check if they match!
assert df.merge(desired_df).shape == df.shape, "You suck!"

Ideally without a for loop, but that's not crucial.

wildcat89
  • `assert df.merge(desired_df).shape == df.shape, "You suck!"` This assertion always fails, even if the two dataframes are identical. – Nick ODell Oct 09 '22 at 00:34
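
To illustrate the comment above, here is a minimal sketch of why the merge-based check fails and one way to compare the frames instead. The names `frame` and `same` are made up for this example. `merge` joins on the shared column, so every repeated value pairs up with every one of its duplicates and the result has more rows than either input. `DataFrame.equals` compares position by position and treats NaNs in the same location as equal.

```python
from numpy import nan
from pandas import DataFrame

# Two identical frames with repeated values (hypothetical example data)
frame = DataFrame({'col': [nan, 0, 1, 0, 1]})
same = frame.copy()

# merge() is a join on 'col': duplicate keys match many-to-many,
# so the merged frame does not keep the original shape
merged = frame.merge(same)
print(merged.shape, frame.shape)

# equals() checks values and dtypes element by element; NaNs in the
# same position count as equal
print(frame.equals(same))
```

Applied to the question, this suggests replacing the final check with something like `assert df.equals(desired_df)`.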

1 Answer

I'd suggest something like this:

missing = df.isna()
missing_cumsum = missing.cumsum()
offset = missing_cumsum - missing_cumsum.where(~missing).ffill().fillna(0)
df = df.ffill() + offset

I based this on code from this answer.
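
For reference, here is the same approach as a self-contained sketch with each step commented, checked against the question's data. The intermediate name `baseline` and the output name `result` are introduced only for this illustration, and the final check uses `DataFrame.equals` in place of the question's merge-based assertion.

```python
from numpy import nan
from pandas import DataFrame

df = DataFrame({'col': [nan, nan, 0, nan, nan, nan, nan, 0, nan, nan,
                        0, 0, nan, nan, nan, nan, nan, nan, 0, nan]})
desired_df = DataFrame({'col': [nan, nan, 0, 1, 2, 3, 4, 0, 1, 2,
                                0, 0, 1, 2, 3, 4, 5, 6, 0, 1]})

# True wherever a value needs to be filled
missing = df.isna()

# Running count of NaNs seen so far, going down the column
missing_cumsum = missing.cumsum()

# The running count as it stood at the most recent non-NaN row
# (0 for rows before the first non-NaN value)
baseline = missing_cumsum.where(~missing).ffill().fillna(0)

# Number of NaNs since the last non-NaN value: 0 on real values, then 1, 2, 3, ...
offset = missing_cumsum - baseline

# Forward-fill the last real value and add the offset.
# Leading NaNs stay NaN because NaN plus any number is NaN.
result = df.ffill() + offset

assert result.equals(desired_df)
```

The subtraction is what resets the counter: each non-NaN row records the NaN count up to that point, so the difference only counts the NaNs in the current gap.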

Nick ODell
  • Thank you! Sorry for the wrong assert above, I copy/pasted it from an SO answer comparing 2 identical df's so assumed it would have worked, I didn't check it. Anyway, thanks!! Works like a charm! – wildcat89 Oct 09 '22 at 01:26