1

I have the following dataframe for business days

In [23]: d = pd.DataFrame({'date' : ['20070105', '20070105', '20070106', '20070106', '20070106', '20070109'], 's' : [1, 2, 1,2,3,1], 'i': ['a', 'b', 'a', 'b', 'c', 'a']})
In [26]: d['date'] = pd.to_datetime(d['date'], format='%Y%m%d')
In [27]: d
Out[27]: 
        date  i  s
0 2007-01-05  a  1
1 2007-01-05  b  2
2 2007-01-06  a  1
3 2007-01-06  b  2
4 2007-01-06  c  3
5 2007-01-09  a  1

I want to fill in the data for missing dates (according to 'alldays' calendar) and the output should be as follows. Basically 20070107 and 20070108 were missing and its data was copied from 20070106.

Out[31]: 
         date  i  s
0  2007-01-05  a  1
1  2007-01-05  b  2
2  2007-01-06  a  1
3  2007-01-06  b  2
4  2007-01-06  c  3
5  2007-01-07  a  1
6  2007-01-07  b  2
7  2007-01-07  c  3
8  2007-01-08  a  1
9  2007-01-08  b  2
10 2007-01-08  c  3
11 2007-01-09  a  1

What is the best way to do this in pandas?

Man
  • 51
  • 4
  • Please read [how to make good reproducible pandas examples](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and edit your post correspondingly. Use dates instead of strings in your input and output data sets – MaxU - stand with Ukraine Nov 18 '17 at 12:11
  • Thanks. Added reproducible code. – Man Nov 19 '17 at 10:57

1 Answers1

2

Use:

  • set_index by cumcount for unique MultiIndex
  • reshape by unstack for unique DatetimeIndex
  • reindex by minimal and maximal values in index, add missing values by method='ffill'
  • reshape back by stack
  • last remove second level of MultiIndex and convert Datetimeindex to column

df = (
      d.set_index(['date', d.groupby('date').cumcount()])
        .unstack()
        .reindex(pd.date_range(d['date'].min(), d['date'].max()), method='ffill')
        .stack()
        .reset_index(level=1, drop=True)
        .rename_axis('date')
        .reset_index()
       )
print (df)

         date  i    s
0  2007-01-05  a  1.0
1  2007-01-05  b  2.0
2  2007-01-06  a  1.0
3  2007-01-06  b  2.0
4  2007-01-06  c  3.0
5  2007-01-07  a  1.0
6  2007-01-07  b  2.0
7  2007-01-07  c  3.0
8  2007-01-08  a  1.0
9  2007-01-08  b  2.0
10 2007-01-08  c  3.0
11 2007-01-09  a  1.0
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252