Related to "Fun with Pandas `rolling_apply` and TypeError" and "Using rolling_apply on a DataFrame object".

Suppose F is a function that returns a function and df is a Series with a datetime index.

The following seems to work:

df.groupby(pandas.TimeGrouper('10d')).apply(F)

while the following results in a TypeError, as rolling_apply seems to expect floats as return values:

pandas.rolling_apply(df, 10, F)

I actually want the latter: I want to group by each window of 10 available days (10 consecutive observations), not by whatever data happens to fall in each 10-day calendar window, which is what I think TimeGrouper does.
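
To make the distinction concrete, here is a minimal sketch of the two kinds of windows on a series with some missing days. It assumes a newer pandas where `pd.Grouper(freq=...)` stands in for `pandas.TimeGrouper`; the sample data and the names `bin_sizes` and `window_sizes` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: 40 calendar days with a few days missing.
idx = pd.date_range("2012-01-01", periods=40, freq="D")
s = pd.Series(np.arange(40.0), index=idx)
s = s.drop(idx[[3, 4, 5, 12, 25, 26]])

# Calendar bins: each group holds whatever observations fall inside a
# 10-day span, so group sizes vary when days are missing.
bin_sizes = s.groupby(pd.Grouper(freq="10D")).size()

# Observation windows: each window holds exactly 10 available rows,
# regardless of how many calendar days they span.
window = 10
window_sizes = [len(s.iloc[i - window:i]) for i in range(window, len(s) + 1)]

print(bin_sizes)
print(set(window_sizes))
```

The calendar bins end up with differing numbers of observations, while every positional window contains exactly 10.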

Is there a direct way to do this using some other function?

For a concrete example, this appears to be working:

import pandas
from numpy.random import randn
from statsmodels.tools.tools import ECDF

s = pandas.Series(randn(1000))
s.index = pandas.date_range('2012-01-01', periods=s.shape[0], freq='D')
# Build one ECDF per 30-day calendar bin, then evaluate each at 0.1
f = s.groupby(pandas.TimeGrouper('30D')).apply(ECDF)
f.apply(lambda x: x(0.1)).head()
mathtick
  • If `F` is a function that returns a function, then I'm not sure how `df.groupby(pandas.TimeGrouper('10d')).apply(F)` is going to work. Can you give a concrete example? – unutbu Sep 01 '14 at 22:17
  • I've added an example with a function returned. – mathtick Sep 01 '14 at 22:42

1 Answer

Well, this is a hack, but you could at least achieve your goal like this:

import numpy as np
import pandas as pd
from statsmodels.tools.tools import ECDF

s = pd.Series(np.random.randn(1000))
s.index = pd.date_range('2012-01-01', periods=s.shape[0], freq='D')
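result = []
# rolling_apply needs a numeric return value, so collect each window's ECDF
# in `result` as a side effect and return a dummy 1.
pd.rolling_apply(s, 10, lambda grp: result.append(ECDF(grp)) or 1)
# Evaluate every stored ECDF at 0.1
print([f(0.1) for f in result])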

The lambda function

lambda grp: result.append(ECDF(grp)) or 1

stores each window's ECDF in the result list. Since result.append returns None, the expression `result.append(ECDF(grp)) or 1` evaluates to 1, a numeric value, so pd.rolling_apply does not raise an error.
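
If the side-effect trick feels too fragile, or the top-level `pd.rolling_apply` is not available in your pandas version, the same windows can be built by plain positional slicing. This is only a sketch: `make_ecdf` is a hypothetical NumPy stand-in for statsmodels' `ECDF`, and `rolling_functions` is an illustrative helper, not a pandas API.

```python
import numpy as np
import pandas as pd


def make_ecdf(values):
    """Return a function x -> fraction of `values` that are <= x.

    Hypothetical stand-in for statsmodels' ECDF, kept here so the
    example only depends on NumPy and pandas.
    """
    sorted_vals = np.sort(np.asarray(values, dtype=float))
    n = len(sorted_vals)

    def ecdf(x):
        # side="right" counts the elements that are <= x
        return np.searchsorted(sorted_vals, x, side="right") / n

    return ecdf


def rolling_functions(series, window, factory):
    """Apply `factory` to every full window of `window` consecutive rows.

    Returns an object-dtype Series of the resulting functions, labelled
    by the last index value of each window.
    """
    funcs = [
        factory(series.iloc[i - window:i])
        for i in range(window, len(series) + 1)
    ]
    return pd.Series(funcs, index=series.index[window - 1:], dtype=object)


rng = np.random.default_rng(0)
s = pd.Series(
    rng.standard_normal(1000),
    index=pd.date_range("2012-01-01", periods=1000, freq="D"),
)

ecdfs = rolling_functions(s, 10, make_ecdf)
values_at_01 = ecdfs.apply(lambda f: f(0.1))
print(values_at_01.head())
```

Passing `ECDF` as the `factory` argument should give the same kind of result as the answer's list, with the added benefit that each function stays aligned with the date that closes its window.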

unutbu