
I have a pandas dataframe

from pandas import DataFrame, Series

where each row corresponds to one case and each column corresponds to one month. I want to compute a rolling sum over each 12-month period. It seems simple enough, but I'm getting stuck with

result = [x for x.rolling_sum(12) in df.iterrows()]
result = [x for x.rolling_sum(12) in df.T.iteritems()]    

SyntaxError: can't assign to function call

a = []
for x in df.iterrows():
    s = x.rolling_sum(12)
    a.append(s)

AttributeError: 'tuple' object has no attribute 'rolling_sum'

dmvianna

1 Answer


I think perhaps what you are looking for is

pd.rolling_sum(df, 12, axis=1)

In that case, no list comprehension is necessary. The axis=1 parameter tells pandas to compute the rolling sum along each row of df, that is, across the month columns, rather than down each column.

For example,

import numpy as np
import pandas as pd
ncols, nrows = 13, 2
df = pd.DataFrame(np.arange(ncols*nrows).reshape(nrows, ncols))
print(df)
#    0   1   2   3   4   5   6   7   8   9   10  11  12
# 0   0   1   2   3   4   5   6   7   8   9  10  11  12
# 1  13  14  15  16  17  18  19  20  21  22  23  24  25

print(pd.rolling_sum(df, 12, axis=1))

prints

   0   1   2   3   4   5   6   7   8   9   10   11   12
0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN   66   78
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN  222  234
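A note for readers on newer pandas versions: the top-level pd.rolling_sum function was later removed in favor of the .rolling() method. Below is a minimal sketch of the same computation, assuming a recent pandas release. The name rolled is only for illustration, and the transpose is used so the example does not rely on the axis argument of rolling, which newer releases have deprecated.

```python
import numpy as np
import pandas as pd

ncols, nrows = 13, 2
df = pd.DataFrame(np.arange(ncols * nrows).reshape(nrows, ncols))

# Transpose so the months run down the index, take a 12-step rolling sum,
# then transpose back so each row is again one case.
rolled = df.T.rolling(window=12).sum().T
print(rolled)
```

The first eleven columns are NaN because a full 12-month window is not yet available there. The last two columns hold the same sums as above (66 and 78 for row 0, 222 and 234 for row 1), printed as floats.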

Regarding your list comprehension:

You've got the parts of the list comprehension in the wrong order. Try:

result = [expression for x in df.iterrows()]

See the docs for more about list comprehensions.

The basic form of a list comprehension is

[expression for variable in sequence]

And the resultant list is equivalent to result after Python executes:

result = []
for variable in sequence:
    result.append(expression)

See this link for the full syntax of list comprehensions.
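To make the ordering concrete, here is a small sketch with plain Python values instead of a DataFrame. The names values, doubled and doubled_loop are made up for this example.

```python
values = [1, 2, 3, 4]

# The expression comes first, followed by the "for variable in sequence" clause.
doubled = [2 * v for v in values]

# The equivalent explicit loop builds the same list.
doubled_loop = []
for v in values:
    doubled_loop.append(2 * v)

print(doubled == doubled_loop)
```

Writing it the other way round, as in [v for 2 * v in values], puts an expression where Python expects a loop variable, which is what triggers the SyntaxError in the question.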

unutbu
  • That solves the first problem. The next problem is that iterrows() yields tuples and iteritems() yields key-value pairs, and neither has a `rolling_sum` method. – PaulMcG Aug 12 '13 at 01:59
  • Thanks. For `[x.rolling_sum(12) for x in df.iterrows()]` I get `AttributeError: 'tuple' object has no attribute 'rolling_sum'` – dmvianna Aug 12 '13 at 02:00
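The AttributeError in the comments above arises because iterrows() yields (index, row) tuples, so x is a tuple rather than a Series. If you do want the row-by-row version, here is a minimal sketch that unpacks the tuple first. It assumes a recent pandas release where rolling sums are written as .rolling(...).sum(), and the names result and rolled are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(26).reshape(2, 13))

# iterrows() yields (index, row) pairs; unpack the pair and use only the row.
result = [row.rolling(window=12).sum() for _, row in df.iterrows()]

# Each list element is a Series named after its row label, so stacking them
# gives back a DataFrame with one row per case.
rolled = pd.DataFrame(result)
print(rolled)
```

This gives the same numbers as the whole-DataFrame rolling sum, but it loops in Python, so the whole-DataFrame approach is usually the better choice for large frames.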