I have a pandas dataframe df
as
Date cost NC
20 5 NaN
21 7 NaN
23 9 78.0
25 6 80.0
Now what I need to do is fillup the missing dates and hence fill the column with a value say x
only if there is number in the previous row. That is I want the output like
Date cost NC
20 5 NaN
21 7 NaN
22 x NaN
23 9 78.0
24 x x
25 6 80.0
See Date 22 was missing and on 21 NC
was missing, So on 22 cost
is assigned to x but NC
is assigned to NaN
. Now setting the Date
column to index
and reindex
ing it to missing values I can get upto here
Date cost NC
20 5.0 NaN
21 7.0 NaN
22 NaN NaN
23 9.0 78.0
24 NaN NaN
25 6.0 80.0
But I cant get to the final output. If you think this way it is like ffill()
but instead of filling from previous row you have to put x
here.
I have another problem. here I have a dataframe df
like this
Date type cost
10 a 30
11 a 30
11 b 25
13 a 27
Here also I have to fill the missing value and make it like this
Date type cost
10 a 30
11 a 30
11 b 25
12 a 30
12 b 25
13 a 27
as you can see there was 2 data row for date 11 so both are copied to 12. I wrote this program for the problem
missing=[12]
for i in missing:
new_date=i
i-=1 #go to previous date
k=df[df["Date"] == i].index.tolist()[-1]+1 #index where to be filled
data=pd.DataFrame(df[df["Date"] == i].values,columns=df.columns)
data["Date"]=new_date
df=pd.concat([df.iloc[:k],data,df.iloc[k:]]).reset_index(drop=True)
Now for a large data set the above program takes a lot of time as has to find index and concat 3 dataframe each time. Is there any better and efficient way to solve this problem?