8

Can you access the individual windows of a pandas rolling window object?

import pandas as pd

rs = pd.Series(range(10))
rs.rolling(window=3)

# prints
Rolling [window=3,center=False,axis=0]

Can I get the windows as groups, like this?

[0,1,2]
[1,2,3]
[2,3,4]
Merlin

3 Answers

5

I will start off by saying that this reaches into the internal implementation. But if you really want to compute the indexers the same way pandas does, here is how.

You will need v0.19.0rc1 (just about released). You can install it with conda install -c pandas pandas=0.19.0rc1

In [41]: rs = pd.Series(range(10))

In [42]: rs
Out[42]: 
0    0
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
dtype: int64

# this reaches into an internal implementation
# the first 3 is the window, then second the minimum periods we
# need    
In [43]: start, end, _, _, _, _ = pandas._window.get_window_indexer(rs.values,3,3,None,use_mock=False)

# starting index
In [44]: start
Out[44]: array([0, 0, 0, 1, 2, 3, 4, 5, 6, 7])

# ending index
In [45]: end
Out[45]: array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

# window size
In [46]: end-start
Out[46]: array([1, 2, 3, 3, 3, 3, 3, 3, 3, 3])

# the indexers
In [47]: [np.arange(s, e) for s, e in zip(start, end)]
Out[47]: 
[array([0]),
 array([0, 1]),
 array([0, 1, 2]),
 array([1, 2, 3]),
 array([2, 3, 4]),
 array([3, 4, 5]),
 array([4, 5, 6]),
 array([5, 6, 7]),
 array([6, 7, 8]),
 array([7, 8, 9])]

This is fairly trivial in the fixed-window case, but it becomes extremely useful in a variable-window scenario. For example, in 0.19.0 you can specify a window like 2S to aggregate by time.
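As a rough illustration of the variable-window case, here is a minimal sketch with a time-based window. The series `ts` and its timestamps are made up for this example, and the window is spelled "2s" in lowercase because recent pandas versions deprecate the uppercase alias.

```python
import pandas as pd

# Hypothetical, irregularly spaced timestamps (not from the original answer)
idx = pd.to_datetime([
    "2016-01-01 00:00:00",
    "2016-01-01 00:00:01",
    "2016-01-01 00:00:02",
    "2016-01-01 00:00:05",
    "2016-01-01 00:00:06",
])
ts = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=idx)

# Each window covers the 2 seconds ending at (and including) the current row,
# so the number of rows per window varies with the spacing of the timestamps.
sums = ts.rolling("2s").sum()
counts = ts.rolling("2s").count()

print(sums)
print(counts)
```

Here the window at 00:00:05 contains only one row, because the previous observation is more than 2 seconds old.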

All of that said, getting these indexers is not particularly useful on its own. You generally want to do something with the results. That is the point of the aggregation functions, or of .apply if you want to aggregate generically.
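A minimal sketch of the .apply route, assuming a reasonably recent pandas where `raw=True` passes each window to the function as a NumPy array. The helper `value_range` is a hypothetical example function, not something from pandas.

```python
import pandas as pd

rs = pd.Series(range(10))

def value_range(window):
    # `window` is a 1-D NumPy array holding the values of one full window
    return window.max() - window.min()

# The first two positions have fewer than 3 values, so they come back as NaN
ranges = rs.rolling(window=3).apply(value_range, raw=True)
print(ranges)
```

The function only ever sees complete windows here, because min_periods defaults to the window size for integer windows.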

Engineero
Jeff
  • I love this idea, but I can't get the call to `pd._window.get_window_indexer()` to work. I keep getting `module 'pandas' has no attribute '_window'`. Any idea how to get around this? Also adding a link to the source for `get_window_indexer()` since its documentation won't show up in my notebook. – Engineero Jul 31 '18 at 19:35
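For the fixed-window case, the start and end arrays shown above can be reproduced with plain NumPy, without touching any internal pandas module. This is only a sketch of the arithmetic, not the pandas implementation, and it does not cover time-based windows.

```python
import numpy as np

n = 10       # number of rows in the series
window = 3   # fixed window size

# Window i ends just after row i (exclusive end) ...
end = np.arange(1, n + 1)
# ... and starts `window` rows earlier, clipped at the start of the data
start = np.clip(end - window, 0, None)

indexers = [np.arange(s, e) for s, e in zip(start, end)]
print(start)
print(end)
print(indexers[:4])
```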
2

Here's a NumPy workaround, though I'm still waiting to see if anyone has a pure pandas solution:

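import numpy as np

def rolling_window(a, window):
    # Build a 2-D view of `a` in which each row is one window of length `window`
    shape   = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

# as_strided needs a NumPy array, so pass the Series' underlying values
rolling_window(rs.values, 3)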

array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4],
       [3, 4, 5],
       [4, 5, 6],
       [5, 6, 7],
       [6, 7, 8],
       [7, 8, 9]])
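On NumPy 1.20 or later, the same view can be built with `sliding_window_view`, which computes the shape and strides for you and returns a read-only view. This is a sketch of an alternative to the hand-rolled `as_strided` call, assuming that NumPy version is available.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

rs = pd.Series(range(10))

# Each row of `windows` is one full window of 3 consecutive values
windows = sliding_window_view(rs.to_numpy(), window_shape=3)
print(windows)
```

Unlike the pandas indexers, this only yields complete windows, so a series of length 10 gives 8 rows.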
Merlin
0

This is solved in pandas 1.1, as the rolling object is now an iterable:

[window.tolist() for window in rs.rolling(window=3) if len(window) == 3]
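A self-contained version of the same idea, as a sketch assuming pandas 1.1 or later. Iteration also yields the shorter leading windows, which is why the length filter is needed.

```python
import pandas as pd

rs = pd.Series(range(10))

# Every window, including the partial ones at the start of the series
all_windows = [window.tolist() for window in rs.rolling(window=3)]

# Only the complete windows of length 3
full_windows = [w for w in all_windows if len(w) == 3]

print(all_windows[:3])
print(full_windows[:3])
```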
Philipp