I have the following dataframe for business days:
In [23]: d = pd.DataFrame({'date': ['20070105', '20070105', '20070106', '20070106', '20070106', '20070109'], 's': [1, 2, 1, 2, 3, 1], 'i': ['a', 'b', 'a', 'b', 'c', 'a']})
In [26]: d['date'] = pd.to_datetime(d['date'], format='%Y%m%d')
In [27]: d
Out[27]:
        date  i  s
0 2007-01-05  a  1
1 2007-01-05  b  2
2 2007-01-06  a  1
3 2007-01-06  b  2
4 2007-01-06  c  3
5 2007-01-09  a  1
I want to fill in the rows for the missing dates (according to an 'alldays' calendar), and the output should be as follows. Here 20070107 and 20070108 were missing, so their rows were copied from 20070106, the most recent date that has data.
Out[31]:
         date  i  s
0  2007-01-05  a  1
1  2007-01-05  b  2
2  2007-01-06  a  1
3  2007-01-06  b  2
4  2007-01-06  c  3
5  2007-01-07  a  1
6  2007-01-07  b  2
7  2007-01-07  c  3
8  2007-01-08  a  1
9  2007-01-08  b  2
10 2007-01-08  c  3
11 2007-01-09  a  1
What is the best way to do this in pandas?
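To make the requirement concrete, here is a minimal sketch of one straightforward way to produce that output. It is not necessarily the best or fastest approach. It assumes that "alldays" means every calendar day between the first and last date in the frame, and the names `all_days`, `groups`, `pieces`, and `filled` are only illustrative.

```python
import pandas as pd

d = pd.DataFrame({
    'date': ['20070105', '20070105', '20070106', '20070106', '20070106', '20070109'],
    's': [1, 2, 1, 2, 3, 1],
    'i': ['a', 'b', 'a', 'b', 'c', 'a'],
})
d['date'] = pd.to_datetime(d['date'], format='%Y%m%d')

# Assumption: the 'alldays' calendar is every calendar day in the observed range.
all_days = pd.date_range(d['date'].min(), d['date'].max(), freq='D')

# Rows belonging to each date that actually has data.
groups = {date: rows for date, rows in d.groupby('date')}

pieces = []
last_rows = None
for day in all_days:
    if day in groups:
        # This day has its own data, so it becomes the new source block.
        last_rows = groups[day]
    # Either the day's own rows or a copy of the latest earlier block,
    # re-stamped with the current day.
    pieces.append(last_rows.assign(date=day))

filled = pd.concat(pieces, ignore_index=True)
print(filled)
```

The loop copies the whole block of rows from the latest available date, which is why 20070107 and 20070108 each get the three rows (a, b, c) from 20070106 rather than a per-`i` forward fill.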