1

I have a dataframe with a date column where some of the dates are missing. I would like to fill these with the day after the date from the previous row.

  date        
0 2015-01-31  
1 NaT        
2 2015-02-02  

So for example in the data frame above, row 1 would be imputed with 2015-02-01. I would like to achieve this without reindexing!

rfkortekaas
  • 6,049
  • 2
  • 27
  • 34

1 Answers1

1

Using a global variable with apply helps ur cause:

import pandas as pd
import numpy as np
df = pd.DataFrame({"d":["2015-01-31", np.nan, "2015-02-02", np.nan, np.nan, "2015-02-02", np.nan]})

df["d"] = pd.to_datetime(df["d"])

print(df)

#df

           d
0 2015-01-31
1        NaT
2 2015-02-02
3        NaT
4        NaT
5 2015-02-02
6        NaT

value = np.nan
def impute_date_rowise(x):
    global value
    if pd.isnull(value):
        value = x
        return value
    elif not pd.isnull(x):
        value = x
        return x
    else:
        value = (value + pd.DateOffset(days=1)).date()
        return value

df["d"].apply(impute_date_rowise)

0   2015-01-31
1   2015-02-01
2   2015-02-02
3   2015-02-03
4   2015-02-04
5   2015-02-02
6   2015-02-03
Narahari B M
  • 346
  • 1
  • 16