
I've reproduced the sliding window code shown here, but I need to modify it to jump two elements at a time instead of just one.

Original code:

from itertools import islice

def window(seq, n=3):
    it = iter(seq)
    result = list(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        result = result[1:] + [elem,]
        yield result

If I start with the following list:

My_List= ['adl_01_11', 'adl_01_12', 'adl_01_13', 'adl_01_14', 'adl_02_15', 'adl_02_16', 'adl_02_17', 'adl_02_18', 'adl_02_19', 'adl_02_20', 'adl_02_21', 'adl_02_22']

and I apply the window over My_List, I get the following result:

[['adl_01_11', 'adl_01_12', 'adl_01_13'], ['adl_01_12', 'adl_01_13', 'adl_01_14'], ['adl_01_13', 'adl_01_14', 'adl_02_15'], ['adl_01_14', 'adl_02_15', 'adl_02_16'], ['adl_02_15', 'adl_02_16', 'adl_02_17'], ['adl_02_16', 'adl_02_17', 'adl_02_18'], ['adl_02_17', 'adl_02_18', 'adl_02_19'], ['adl_02_18', 'adl_02_19', 'adl_02_20'], ['adl_02_19', 'adl_02_20', 'adl_02_21'], ['adl_02_20', 'adl_02_21', 'adl_02_22']]

How do I change this function if I want to iterate through 2 items at a time? That means I expect a result like this:

[['adl_01_11', 'adl_01_12', 'adl_01_13'], ['adl_01_13', 'adl_01_14', 'adl_01_15'], ['adl_01_15', 'adl_01_16', 'adl_02_17'], ['adl_01_17', 'adl_02_18', 'adl_02_19'], ['adl_02_19', 'adl_02_20', 'adl_02_21']]

Notice that adl_02_22 is no longer in the results, and my window iterates every 2 items.

In the window function I tried changing result[1:] to result[2:] but it doesn't work well. Any idea?

joanis
QQ1821
  • is using `yield` necessary? – nobleknight Apr 29 '21 at 13:50
  • Maybe `result = result[2:] + [elem, next(it)]`, with a try/except to catch the `StopIteration` exception if there is an element left over? It's not super elegant, though. It might be best to unroll the inner `for`. – joanis Apr 29 '21 at 13:52
  • Does this answer your question? [Rolling or sliding window iterator?](https://stackoverflow.com/questions/6822725/rolling-or-sliding-window-iterator) – dantiston Apr 29 '21 at 14:34
  • @dantiston That's where OP found the code they asked to modify, so clearly, no! That answer rolls one item at a time; OP wants to modify the code to roll two items at a time. – joanis Apr 29 '21 at 15:15
  • @QQ1821 Please look at my updated solution for a more general answer, since I had first hard-coded the window size at exactly 3. – joanis Apr 29 '21 at 16:38

3 Answers


I propose three solutions for this problem:

  1. a specific one for a window of size 3 with a step of 2,
  2. a general one with any window and step size,
  3. skip all this and use existing libraries.

Solution 1: sliding window with hard-coded size=3 and step=2

If you replace the `for elem in it:` loop with the equivalent `while True:` loop, which calls `next(it)` until `StopIteration` is raised, you can call `next(it)` twice per iteration instead of just once:

def window_size3_step2(seq):
    it = iter(seq)

    try:
        # Slots 0 and 1 are placeholders; the loop below only reads result[2].
        result = [0, 0, next(it)]
    except StopIteration:
        return

    while True:
        try:
            result = [result[2], next(it), next(it)]
        except StopIteration:
            break
        else:
            yield result


My_List= ['adl_01_11', 'adl_01_12', 'adl_01_13', 'adl_01_14', 'adl_02_15', 'adl_02_16', 'adl_02_17', 'adl_02_18', 'adl_02_19', 'adl_02_20', 'adl_02_21', 'adl_02_22']

print(f"{list(window_size3_step2(My_List))}")

Output:

[['adl_01_11', 'adl_01_12', 'adl_01_13'], ['adl_01_13', 'adl_01_14', 'adl_02_15'], ['adl_02_15', 'adl_02_16', 'adl_02_17'], ['adl_02_17', 'adl_02_18', 'adl_02_19'], ['adl_02_19', 'adl_02_20', 'adl_02_21']]

Testing for shorter lists:

for n in range(7):
    print(f"len={n} result={list(window_size3_step2(range(n)))}")

len=0 result=[]
len=1 result=[]
len=2 result=[]
len=3 result=[[0, 1, 2]]
len=4 result=[[0, 1, 2]]
len=5 result=[[0, 1, 2], [2, 3, 4]]
len=6 result=[[0, 1, 2], [2, 3, 4]]

Solution 2: general window function with arbitrary size and step

This second solution goes back to using `islice` to take into account the given window size argument, which I've renamed `size` for clarity, and accepts a `step` argument that can also take any positive integer value.

from itertools import islice
def window(seq, size=3, step=1):
    if size < 1 or step < 1:
        raise ValueError("Nobody likes infinite loops.")
    it = iter(seq)
    result = list(islice(it, size))
    while len(result) == size:
        yield result
        if step >= size:
            result = list(islice(it, step-size, step))
        else:
            result = result[step:] + list(islice(it, step))

On your input list, `list(window(My_List, size=3, step=2))`, or just `list(window(My_List, step=2))`, gives the list of lists you want.

I've also tested this with a wide variety of sequence lengths, window sizes and step sizes, and it gave the correct result in every case I tried. E.g., the output of this loop (try it yourself, I don't want to paste this long output here) is correct on every line:

for input_size in range(10):
    for window_size in range(1,4):
        for step_size in range(1,4):
            print(f"len={input_size} size={window_size} step={step_size} "
                  f"result={list(window(range(input_size), size=window_size, step=step_size))}")
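
Rather than reading the printed output line by line, the same sweep can be checked automatically. This is only a sketch: it assumes the input is a sequence that supports slicing, so that a slice-based reference can be computed, and `expected_windows` is a name made up for this check. `window` is repeated from above (with a plainer error message) so the block runs on its own.

```python
from itertools import islice


def window(seq, size=3, step=1):
    # Same generator as in Solution 2, repeated so this block is self-contained.
    if size < 1 or step < 1:
        raise ValueError("size and step must be positive")
    it = iter(seq)
    result = list(islice(it, size))
    while len(result) == size:
        yield result
        if step >= size:
            # Skip the gap between windows, then read a whole new window.
            result = list(islice(it, step - size, step))
        else:
            # Keep the overlapping tail and read `step` new elements.
            result = result[step:] + list(islice(it, step))


def expected_windows(seq, size, step):
    # Hypothetical reference: slice out every full window by index.
    return [list(seq[i:i + size]) for i in range(0, len(seq) - size + 1, step)]


for input_size in range(10):
    for window_size in range(1, 6):
        for step_size in range(1, 6):
            data = list(range(input_size))
            got = list(window(data, size=window_size, step=step_size))
            assert got == expected_windows(data, window_size, step_size)
```

The sweep goes up to 5 for both size and step so that it also covers steps larger than the window, which is the branch that skips elements.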

Solution 3: there's a library for this!

The more_itertools library already provides a function that does just this, `windowed`.

I had to install it first:

pip3 install more_itertools

Use it:

from more_itertools import windowed
print(f"{list(windowed(My_List, 3, step=2))}")

[('adl_01_11', 'adl_01_12', 'adl_01_13'), ('adl_01_13', 'adl_01_14', 'adl_02_15'), ('adl_02_15', 'adl_02_16', 'adl_02_17'), ('adl_02_17', 'adl_02_18', 'adl_02_19'), ('adl_02_19', 'adl_02_20', 'adl_02_21'), ('adl_02_21', 'adl_02_22', None)]

It's not exactly what you asked for, though, because it pads the last incomplete window with None (or any fill value you provide) instead of truncating the end.

While using existing libraries is often a good choice, I learned more creating solutions 1 and 2, and I hope you find value in the progression.

Credits:

I found the more_itertools solution here: https://stackoverflow.com/a/46412374/3216427

joanis
  • Your output does not match the expected output that the OP posted. – Kapocsi Apr 29 '21 at 14:03
  • Sorry @Kapocsi I cannot see the difference. Can you pinpoint it for me and I'll adjust my answer as needed? – joanis Apr 29 '21 at 14:10
  • I think the OP transcribed his expected result incorrectly: your_output[1][2] != op_expected_output[1][2], your_output[2][0] != op_expected_output[2][0], your_output[2][1] != op_expected_output[2][1], your_output[3][0] != op_expected_output[3][0] – Kapocsi Apr 29 '21 at 14:19
  • Thank you @Kapocsi I see what you mean, now. I will assume each instance of `adl_01_15` in the expected output was meant to be `adl_02_15`, since that's what was in the input list. – joanis Apr 29 '21 at 14:25
  • NP. It looks like `adl_01_15`, `adl_01_16`, and `adl_01_17` are not in the original list. – Kapocsi Apr 29 '21 at 14:27
  • @joanis Indeed, I made an error and used the wrong list, but anyway thanks a lot for your time and your answer, that's exactly what I needed – QQ1821 Apr 30 '21 at 06:32
  • I just read through all the answers in the question you started with, and I noticed that one of the answers (https://stackoverflow.com/a/46412374/3216427) mentions `more_itertools.windowed` (https://more-itertools.readthedocs.io/en/stable/api.html#more_itertools.windowed), which does exactly what I did, but with more features supported. – joanis Apr 30 '21 at 13:25

I think you may have transcribed your expected output incorrectly.

The items adl_01_15, adl_01_16 and adl_01_17 in your expected output do not exist in My_List.

If so, this will do:

islice(window(My_List), 0, None, 2)

and if you don't need a generator:

list(window(My_List))[::2]
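
To make this concrete, here is a self-contained sketch of the same idea with the step as a parameter. `window` is the question's generator with its `islice` import added; `window_every` is a name introduced here purely for illustration. It still builds every one-step window internally and discards the ones in between, so it does more work than necessary for large steps.

```python
from itertools import islice


def window(seq, n=3):
    # The one-step sliding window from the question.
    it = iter(seq)
    result = list(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        result = result[1:] + [elem]
        yield result


def window_every(seq, n=3, step=2):
    # Hypothetical helper: keep every `step`-th one-step window.
    return islice(window(seq, n), 0, None, step)


print(list(window_every(range(8))))
# [[0, 1, 2], [2, 3, 4], [4, 5, 6]]
```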
Kapocsi
# input
lst = ['adl_01_11', 'adl_01_12', 'adl_01_13', 'adl_01_14', 'adl_02_15', 'adl_02_16', 'adl_02_17', 'adl_02_18', 'adl_02_19', 'adl_02_20', 'adl_02_21', 'adl_02_22']
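# drop the last entry when the length is even, so every midpoint has a right neighbour
# (the original lst[:-(len(lst[1:]) % 2)] becomes lst[:0], an empty list, for odd lengths)
lst = lst[:len(lst) - (len(lst) - 1) % 2]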
# get midpoints of sublist and add previous and following value to it
lst = [[lst[i-1], x, lst[i+1]] for i, x in enumerate(lst) if ((i+1) % 2) == 0]

print(lst)
# [['adl_01_11', 'adl_01_12', 'adl_01_13'],
#  ['adl_01_13', 'adl_01_14', 'adl_02_15'],
#  ['adl_02_15', 'adl_02_16', 'adl_02_17'],
#  ['adl_02_17', 'adl_02_18', 'adl_02_19'],
#  ['adl_02_19', 'adl_02_20', 'adl_02_21']]
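
The same index-based idea can also be written without trimming the list first. A minimal sketch, assuming the input is a list (or another sequence that supports `len` and slicing); `windows_by_index` and its `size` and `step` parameters are names chosen here for illustration.

```python
def windows_by_index(lst, size=3, step=2):
    # Start a window at every `step`-th index that still leaves room for a
    # full window, so incomplete trailing windows are never produced.
    return [lst[i:i + size] for i in range(0, len(lst) - size + 1, step)]


lst = ['adl_01_11', 'adl_01_12', 'adl_01_13', 'adl_01_14', 'adl_02_15']
print(windows_by_index(lst))
# [['adl_01_11', 'adl_01_12', 'adl_01_13'], ['adl_01_13', 'adl_01_14', 'adl_02_15']]
```

Unlike the generator versions, this needs the whole input in memory and will not work on a plain iterator.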
Andreas