I have a table with lots of report gaps (missing report dates). I want to fix the data by adding the missing report dates onto each account (represented by ID).
For example, let's set the start date as 2014-07-01 and the end date as 2018-03-13. There are 970 report dates excluding the weekends in total.
The table looks like below...Let's call it "df". My original df is the data frame shown below without rows beginning with GAP. Rows starting with GAP are rows I want to add.
for step 1, I want to add all the missing report dates onto this table (rows starting with GAP below), and then for step 2, I want to fill in all the "balances" I added from step 1 with 0. And for step 3, I want to fill in other variables with the last value before the gap. And for the last step, I want to apply it to all IDs.
ID report_date balance other_v1 other_v2
DAY 1 1 7/1/2014 1500 0
DAY 2 1 7/2/2014 1500 0.1
GAP 1 (DAY 3) 1 7/3/2014 0 0.1
GAP 2 (DAY 4) 1 ... 0 0.1
... 1 ... 0... 0.1..
GAP 30 (DAY 32) 1 8/10/2014 0 0.1
DAY 33 1 8/11/2014 1500 0.3
DAY 34 1 8/12/2014 1600 0.3
DAY 35 1 8/13/2014 1600 0.25
GAP 31 (DAY 36) 1 8/14/2014 0 0.25
GAP 32 (DAY 37) 1 8/15/2014 0 0.25
... 1 ... 0 ...
DAY 970 (OR GAP N) 1 3/13/2018 0 ...