
I have a pandas Series:

0          3456
1          8526
2          3216
3          1236
4          7777
5          7778
6          7578
7          8778
8          7878
9          7078

I want to convert this to a DataFrame like so:

0 3456 8526 3216 1236 7777
1 8526 3216 1236 7777 7778
2 3216 1236 7777 7778 7578
3 1236 7777 7778 7578 8778
4 7777 7778 7578 8778 7878
5 7778 7578 8778 7878 7078

This DataFrame has a window size of 5 (the number of columns), based on a sliding window of 1 over the original Series. It should be dynamic, so that it can accommodate any positive sliding window. For example, with a sliding window of 2 (the last value is discarded):

0 3456 8526 3216 1236 7777
1 3216 1236 7777 7778 7578
2 7777 7778 7578 8778 7878

My problem differs from the existing solutions in that my pandas Series is very large, and current solutions often lead to out-of-memory (OOM) errors when the window size is large (above 10000, which I require).
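For a rough sense of scale, here is a back-of-the-envelope sketch of the row count and of the memory a fully materialised result would need. The helper names `n_windows` and `dense_bytes` are hypothetical, the 8-byte item size assumes an `int64` Series, and the sizes at the end are illustrative rather than measured.

```python
def n_windows(n: int, window: int, step: int = 1) -> int:
    """Number of complete windows of length `window` taken every `step` elements."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if n < window:
        return 0
    return (n - window) // step + 1


def dense_bytes(n: int, window: int, step: int = 1, itemsize: int = 8) -> int:
    """Bytes needed if every window is copied into its own row (assumes a fixed itemsize)."""
    return n_windows(n, window, step) * window * itemsize


# The 10-element example from the question
print(n_windows(10, 5, 1))  # 6 rows
print(n_windows(10, 5, 2))  # 3 rows

# Illustrative sizes only: 1 million rows, window of 10000, step of 1
print(dense_bytes(1_000_000, 10_000) / 1e9)  # size in GB
```

With these assumed sizes the copied result would need roughly 79 GB, which is why any approach that builds every window as a separate row runs out of memory.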

To summarise:

  1. Convert the Series to a DataFrame as shown above, while
  2. not running into OOM errors for large window sizes, and
  3. supporting any positive integer sliding window.

Any help is appreciated :)

EDIT: one can assume the number of rows in the original Series is in the millions

ben
  • In the accepted answer of the linked question, using approach 2, your window size is `L` e.g. 5, and what you call "sliding window" is stride length / step size `S` e.g. 2. So `pd.DataFrame(strided_app(series.to_numpy(), window_size, step_size))` should do. – Mustafa Aydın Apr 18 '21 at 09:22
  • Yes I think this does indeed solve my issue. Thanks. – ben Apr 18 '21 at 09:26
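
A minimal sketch of the idea in the comment above, not the linked answer's actual `strided_app`: the hypothetical helper `windowed_view` uses `numpy.lib.stride_tricks.sliding_window_view` (available from NumPy 1.20) and then keeps every `step`-th row. The result is a read-only view on the original data, so no window is copied until something materialises it. Whether `pd.DataFrame(...)` copies the data is an assumption to check, since it may depend on the pandas version and settings.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def windowed_view(values: np.ndarray, window: int, step: int = 1) -> np.ndarray:
    """Return a 2-D read-only view of overlapping windows, one window per row."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    # sliding_window_view yields every window at step 1; slicing rows keeps it a view
    return sliding_window_view(values, window)[::step]


series = pd.Series([3456, 8526, 3216, 1236, 7777, 7778, 7578, 8778, 7878, 7078])

view = windowed_view(series.to_numpy(), window=5, step=2)
df = pd.DataFrame(view)
print(df)
```

For a very large Series it may be safer to work on the NumPy view directly, or on slices of its rows, since any operation that copies the whole result needs rows × window elements of memory.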
