
Pandas Rolling function

Last elements when window_size == step_size

I can't seem to get the last three elements of an example 9-element Series to be rolled on when my window size and step size are both 3.

Is the below an intended behaviour of pandas?

My desired outcome

If so, how can I roll over the Series so that

    pd.Series([1., 1., 1., 2., 2., 2., 3., 3., 3.]).rolling(window=3, step=3).mean()

evaluates to pd.Series([1., 2., 3.])?

Example

    import pandas as pd

    def print_mean(x):
        print(x)
        return x.mean()

    df = pd.DataFrame({"A": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]})

    df["left"] = (
        df["A"].rolling(window=3, step=3, closed="left").apply(print_mean, raw=False)
    )
    df["right"] = (
        df["A"].rolling(window=3, step=3, closed="right").apply(print_mean, raw=False)
    )
    df["both"] = (
        df["A"].rolling(window=3, step=3, closed="both").apply(print_mean, raw=False)
    )
    df["neither"] = (
        df["A"].rolling(window=3, step=3, closed="neither").apply(print_mean, raw=False)
    )

This evaluates to:

     A  left  right  both  neither
0  0.0   NaN    NaN   NaN      NaN
1  1.0   NaN    NaN   NaN      NaN
2  2.0   NaN    NaN   NaN      NaN
3  3.0   1.0    2.0   1.5      NaN
4  4.0   NaN    NaN   NaN      NaN
5  5.0   NaN    NaN   NaN      NaN
6  6.0   4.0    5.0   4.5      NaN
7  7.0   NaN    NaN   NaN      NaN
8  8.0   NaN    NaN   NaN      NaN

and prints:

0    0.0
1    1.0
2    2.0
dtype: float64
3    3.0
4    4.0
5    5.0
dtype: float64
1    1.0
2    2.0
3    3.0
dtype: float64
4    4.0
5    5.0
6    6.0
dtype: float64
0    0.0
1    1.0
2    2.0
3    3.0
dtype: float64
3    3.0
4    4.0
5    5.0
6    6.0
dtype: float64
semyd
  • Are you sure you want to use step here, which "evaluate[s] the window at every step result, equivalent to slicing as [::step]?" – It_is_Chris Jan 19 '23 at 14:57
  • Yes, I would like to do convolution on these series. – semyd Jan 19 '23 at 15:14
  • Are you sure you understand what `step` is doing? -- try the following to see if this is still what you are looking for and play around with different steps, windows, min_periods, etc.: `for win in pd.Series([1., 1., 1., 2., 2., 2., 3., 3., 3.]).rolling(window=3, step=3): print(win)` – It_is_Chris Jan 19 '23 at 15:21
  • Yes, step configures how far the start of the next window's left index is compared to the previous one. – semyd Jan 19 '23 at 15:24
  • What I don't understand is that, for example, your snippet outputs three windows: index 0 (value 1.0); indices 1, 2, 3 (values 1.0, 1.0, 2.0); and indices 4, 5, 6 (values 2.0, 2.0, 3.0). Why does it handle the first element like that? – semyd Jan 19 '23 at 15:25
  • Because you have the step set to three, so it is the equivalent of doing `s[::3]` (where `s` is the series). Notice the index: `[0, 3, 6]`. You cannot look back two more steps from the first element in the array, so it returns NaN. Then at the next index (3) you look back two more spots (indices 1, 2, 3) and take the mean, which is (1+1+2)/3, roughly 1.33. – It_is_Chris Jan 19 '23 at 15:31
  • I am not sure if I got the question correctly, but is it possible you are looking for something like this? `ser.groupby(ser.index // 3).mean()` (where `ser = pd.Series([1., 1., 1., 2., 2., 2., 3., 3., 3.])`). – ayhan Jan 19 '23 at 15:43
  • @user19308385 I think you are correct. Post it as an answer, you'll get +1 from me. – It_is_Chris Jan 19 '23 at 15:46
  • Thanks @It_is_Chris, I understand now. Thanks user19308385, this is a working method for the example above. (However, I'll have to adapt it to datetime.) – semyd Jan 19 '23 at 16:15
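
The points made in the comments above can be checked with a short sketch. It assumes pandas 1.5 or later (where `step` is available) and a default integer index; the variable names, the daily dates, and the use of `resample` for the datetime case are illustrative assumptions, not something confirmed in the thread.

```python
import pandas as pd

ser = pd.Series([1., 1., 1., 2., 2., 2., 3., 3., 3.])

# step=3 evaluates the usual trailing window only at positions 0, 3, 6,
# so it lines up with slicing the full rolling result by [::3].
stepped = ser.rolling(window=3, step=3).mean()
sliced = ser.rolling(window=3).mean()[::3]

# Non-overlapping blocks of three: label each row by index // 3, then aggregate.
block_means = ser.groupby(ser.index // 3).mean()

# Possible datetime adaptation (assumes regularly spaced daily data):
# resample into 3-day bins instead of integer-dividing the index.
ts = pd.Series(ser.to_numpy(), index=pd.date_range("2023-01-01", periods=9, freq="D"))
block_means_ts = ts.resample("3D").mean()
```

Here `block_means` holds one mean per block of three rows, which is the result asked for in the question, while `stepped` shows why `step=3` alone does not produce it: the first window ends at position 0 and is incomplete.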

1 Answer


You could try subsetting every 3 rows by using the % operator:

    df[df.index % 3 == 0]

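
If the goal is one mean per non-overlapping block, as in the question, the same row-selection idea could be applied to the result of an ordinary rolling mean rather than to the raw frame. This is a minimal sketch, assuming a default integer index and a length that is a multiple of the window; `window`, `rolled`, and `block_means` are illustrative names.

```python
import pandas as pd

ser = pd.Series([1., 1., 1., 2., 2., 2., 3., 3., 3.])
window = 3

# Ordinary rolling mean, evaluated at every row (no step).
rolled = ser.rolling(window=window).mean()

# Keep only the rows where a full block ends (positions 2, 5, 8),
# then renumber the result from 0.
block_means = rolled[rolled.index % window == window - 1].reset_index(drop=True)
```

Selecting on `window - 1` rather than `0` matters because each rolling window is labelled by its last row, so the complete blocks end at positions 2, 5, and 8.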

Adam Jaamour