
Consider a Series with a MultiIndex that has a natural grouping key on level 0 and a time series on level 1:

import pandas as pd

s = pd.Series(range(12), index=pd.MultiIndex.from_product([['a','b','c'],
              pd.date_range(start='2019-01-01', freq='3D', periods=4)], names=['grp','ts']))
print(s)
grp  ts
a    2019-01-01     0
     2019-01-04     1
     2019-01-07     2
     2019-01-10     3
b    2019-01-01     4
     2019-01-04     5
     2019-01-07     6
     2019-01-10     7
c    2019-01-01     8
     2019-01-04     9
     2019-01-07    10
     2019-01-10    11
Length: 12, dtype: int64

I want to upsample the time series within each outer index value to daily frequency, say with a simple forward fill:

s.groupby(['grp', pd.Grouper(level=1, freq='D')]).ffill()

This produces an unexpected result: it does nothing. The output is identical to s, whereas what I want is:

grp ts
a   2019-01-01   0
    2019-01-02   0
    2019-01-03   0
    2019-01-04   1
    2019-01-05   1
    2019-01-06   1
    2019-01-07   2
    2019-01-08   2
    2019-01-09   2
    2019-01-10   3
b   2019-01-01   4
    2019-01-02   4
    2019-01-03   4
    2019-01-04   5
    2019-01-05   5
    2019-01-06   5
    2019-01-07   6
    2019-01-08   6
    2019-01-09   6
    2019-01-10   7
c   2019-01-01   8
    2019-01-02   8
    2019-01-03   8
    2019-01-04   9
    2019-01-05   9
    2019-01-06   9
    2019-01-07  10
    2019-01-08  10
    2019-01-09  10
    2019-01-10  11
Length: 30, dtype: int64

Changing the Grouper freq or swapping ffill for another fill function makes no difference. The one workaround I found is to force a plain time series index onto each group (thank you Allen for providing the answer https://stackoverflow.com/a/44719843/3109201):

s.reset_index(level=1).groupby('grp').apply(lambda g: g.set_index('ts').resample('D').ffill())

which is slightly different from what I was originally asking for, because it returns a DataFrame:

                 0
grp ts
a   2019-01-01   0
    2019-01-02   0
    2019-01-03   0
    2019-01-04   1
    2019-01-05   1
    2019-01-06   1
    2019-01-07   2
    2019-01-08   2
    2019-01-09   2
    2019-01-10   3
b   2019-01-01   4
    2019-01-02   4
    2019-01-03   4
    2019-01-04   5
    2019-01-05   5
    2019-01-06   5
    2019-01-07   6
    2019-01-08   6
    2019-01-09   6
    2019-01-10   7
c   2019-01-01   8
    2019-01-02   8
    2019-01-03   8
    2019-01-04   9
    2019-01-05   9
    2019-01-06   9
    2019-01-07  10
    2019-01-08  10
    2019-01-09  10
    2019-01-10  11

[30 rows x 1 columns]
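If a Series is needed rather than a DataFrame, one option is to select the single value column from the workaround's result. The following is a minimal sketch, assuming the column is labelled 0 (as in the output above, because s has no name); the variable names `idx` and `upsampled` are illustrative only.

```python
import pandas as pd

idx = pd.MultiIndex.from_product(
    [["a", "b", "c"], pd.date_range(start="2019-01-01", freq="3D", periods=4)],
    names=["grp", "ts"],
)
s = pd.Series(range(12), index=idx)

# Same workaround as above, then pick the only value column (labelled 0
# because s is unnamed) so that the result is a Series again.
upsampled = (
    s.reset_index(level="ts")
    .groupby("grp")
    .apply(lambda g: g.set_index("ts").resample("D").ffill())[0]
)

# Optional tidy-up: drop the leftover column label so the Series is unnamed like s.
upsampled.name = None

print(upsampled)
```

The result should keep the (grp, ts) MultiIndex and contain one row per group per day.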

I can and will use this workaround, but I'd like to know why the simpler (and frankly more elegant) method is not working.

Ganymede
  • to be clear: I'm asking why the Grouper method doesn't work, not what I can do to get what I want without it. – Ganymede Sep 20 '19 at 20:18

1 Answer


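Use Series.asfreq(), which fills in the missing dates.

def filldates(s_in):
    # Drop the group level so each group has a plain DatetimeIndex
    # (returns a new Series instead of mutating the group in place)
    s_in = s_in.droplevel("grp")
    # Upsample to daily frequency, forward-filling the newly created rows
    return s_in.asfreq("1D", method="ffill")

s.groupby(level=0).apply(filldates)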
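As for why the Grouper call in the question leaves s untouched, a likely explanation is that GroupBy.ffill is a transformation: it returns an object indexed like its input and fills only missing values that already exist within each group. A Grouper with freq='D' only assigns the 12 existing rows to daily bins; it does not create rows for the days in between, so there is nothing to fill. The sketch below contrasts the two approaches under that reading; the names `unchanged` and `filled` are illustrative.

```python
import pandas as pd

idx = pd.MultiIndex.from_product(
    [["a", "b", "c"], pd.date_range(start="2019-01-01", freq="3D", periods=4)],
    names=["grp", "ts"],
)
s = pd.Series(range(12), index=idx)

# The call from the question: rows are only labelled with (grp, daily bin).
# ffill is a transform, so the result keeps the original 12-row index.
unchanged = s.groupby(["grp", pd.Grouper(level=1, freq="D")]).ffill()

# Reindexing each group to daily frequency is what actually adds rows.
filled = s.groupby(level="grp").apply(
    lambda g: g.droplevel("grp").asfreq("D", method="ffill")
)

print(len(unchanged), len(filled))
```

Upsampling needs a step that builds a new, denser index (asfreq, resample, or reindex); grouping plus a fill on the existing index cannot do that.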
Rean