df = pd.DataFrame({"ID":[1,1,1,2,2,2],
"Date":["29.12.2020","05.01.2021","15.02.2021","11.04.2021","27.05.2021","29.05.2021"],
"Amount":[6,5,7,9,8,7]})
df["Date"] = pd.to_datetime(df["Date"])
Data:
ID | Date | Amount |
---|---|---|
1 | 29.12.2020 | 6 |
1 | 05.01.2021 | 5 |
1 | 15.02.2021 | 7 |
2 | 11.04.2021 | 9 |
2 | 27.05.2021 | 8 |
2 | 29.05.2021 | 7 |
Desired output:
ID | Date | Amount |
---|---|---|
1 | 01.12.2020 | 0 |
. | . | . |
. | . | . |
1 | 28.12.2020 | 0 |
1 | 29.12.2020 | 6 |
. | . | . |
. | . | . |
1 | 04.01.2020 | 6 |
1 | 05.01.2021 | 5 |
. | . | . |
. | . | . |
1 | 14.02.2021 | 5 |
1 | 15.02.2021 | 7 |
. | . | . |
. | . | . |
1 | 28.02.2021 | 7 |
2 | 01.04.2021 | 0 |
. | . | . |
. | . | . |
2 | 10.04.2021 | 0 |
2 | 11.04.2021 | 9 |
. | . | . |
. | . | . |
2 | 26.05.2021 | 9 |
2 | 27.05.2021 | 8 |
2 | 28.05.2021 | 8 |
2 | 29.05.2021 | 7 |
So, basically, I want to expand the data set for the missing dates and insert the latest of preceding amount value for every newly added date.
The date ranges of the ID groups do vary.
Plus, fill value should be 0 for the unobserved values in the month associated with the first observation (e.g. Amount column takes the value of 0 between 01.12.2020 and 28.12.2020, since the first observation of twelfth month of 2020 is on 29th day).