
I am getting an exception when attempting to apply a custom rolling function to a pandas data frame. For example:

import statsmodels.api as sm
import pandas as pd
import numpy as np

def univar_regr_beta(y, x):
    Y, X = y.as_matrix(), x.as_matrix()

    X = sm.add_constant(X)

    model = sm.OLS(Y, X)
    return model.fit().params[1]

df = pd.DataFrame(np.random.randn(20,3))
srs = pd.Series(np.random.randn(20))

# this returns a value e.g.: 0.06608957
univar_regr_beta(df[0], srs)

# and this returns a rolling sum dataframe
df.rolling(5, 5).apply(np.sum)

# but this breaks when attempting to get a rolling beta
df.rolling(5, 5).apply(lambda x: univar_regr_beta(x, srs))

Specifically the exception I get is the following:

AttributeError: 'numpy.ndarray' object has no attribute 'as_matrix'

It looks as though each column, when passed into univar_regr_beta via the lambda, arrives as a NumPy array rather than a Series. I am not sure whether there is a better way to compute a rolling beta, or whether I am just missing something.
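The suspicion in the previous paragraph can be checked directly. The following is a minimal sketch, assuming a pandas version in which `Rolling.apply` accepts a `raw` argument (as far as I know it was added after this question was posted). The names `record_type` and `seen` are hypothetical and exist only for this illustration.

```python
import numpy as np
import pandas as pd

col = pd.Series(np.random.randn(20))
seen = {}  # hypothetical helper: maps a label to the window type observed

def record_type(label):
    # Build a callback that remembers the type of the window it receives.
    def callback(window):
        seen[label] = type(window).__name__
        return 0.0  # rolling.apply expects a scalar back
    return callback

# raw=True hands each window over as a plain NumPy array ...
col.rolling(5, min_periods=5).apply(record_type("raw=True"), raw=True)
# ... while raw=False hands it over as a Series that keeps its index.
col.rolling(5, min_periods=5).apply(record_type("raw=False"), raw=False)

print(seen)
```

With `raw=True` the window has no pandas methods at all, which would explain an `AttributeError` like the one above.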

Any help is appreciated. Thanks

evariste galois
  • 2
    You do not need the line `Y, X = y.as_matrix(), x.as_matrix()` whatsoever. Just use `x` and `y` and let the functions sort out the container issues. – DYZ Aug 05 '17 at 03:19
  • 2
    You have a more prominent problem in your code. The `lambda` parameter, `x`, has 5 elements (because that is the window size), and so does `Y` in `univar_regr_beta`. But `srs` has 20 items, and so does `X`. You cannot construct an `sm.OLS` model when `x` and `y` have different lengths. – DYZ Aug 05 '17 at 03:31
  • Thanks @DYZ. Regarding the difference in length of x and y going into the regression: I actually tried reindexing x and y to the intersection of their indexes before regressing in univar_regr_beta, but I couldn't, for the same reason: the y variable coming from the lambda is not a pandas object, it's a NumPy array. I also tried removing the as_matrix calls, but still had no luck, for the same reason. Do you have a recommendation for a better way to calculate a rolling beta of each column of one data frame against a separate series? In Wes' book I believe he uses pandas OLS, but I'm pretty sure that is now deprecated. – evariste galois Aug 05 '17 at 17:17
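One possible way to get a rolling beta of each column against `srs` is sketched below. It is not the asker's statsmodels code: it swaps `sm.OLS` for the closed-form slope of a univariate regression with an intercept, cov(y, x) / var(x). It also assumes a pandas version where `Rolling.apply` accepts `raw=False`, so each window arrives as a Series whose index can be used to pick the matching rows of `srs`. The names `window_beta`, `beta_apply` and `beta_vec` are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((20, 3)))
srs = pd.Series(rng.standard_normal(20))
window = 5

def window_beta(y_window, x):
    # y_window is a Series (raw=False); use its index to cut x to the
    # same rows, so both sides of the regression have equal length.
    x_window = x.loc[y_window.index]
    # Slope of y on x with an intercept: cov(y, x) / var(x).
    return y_window.cov(x_window) / x_window.var()

# Option 1: keep the rolling-apply structure from the question.
beta_apply = df.rolling(window, min_periods=window).apply(
    lambda w: window_beta(w, srs), raw=False
)

# Option 2: the same quantity from built-in rolling covariance and
# variance, avoiding a Python-level call per window.
rolling_cov = df.apply(lambda col: col.rolling(window).cov(srs))
rolling_var = srs.rolling(window).var()
beta_vec = rolling_cov.div(rolling_var, axis=0)

print(beta_apply.tail())
print(beta_vec.tail())
```

Both results are NaN for the first `window - 1` rows, since a full window is not yet available there. If standard errors or other regression statistics are needed, the closed-form slope is not enough and a per-window model fit would still be required.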

0 Answers