3

I am trying to adjust how far a sliding window moves at each step. There are a lot of posts about sliding windows on SO, but I can't seem to find one that explains how to adjust the distance the window slides. I am also not necessarily interested in chunking or in only adjusting the window size (1, 2).

As an example, suppose I have a string of six characters:

seq = 'ATCGAC'

If I set the window size to 1 and want the window to slide over 2 characters per step, I would want the following output:

Expected output:

['A', 'C', 'A']

As another example, if I take the same string, set the window size to 3, and slide the window over 3 characters at a time, I would want the following output:

Expected output:

['ATC', 'GAC']

As a final example with a longer string, using a window size of 3 and sliding the window over 6 characters at a time:

seq = 'ATCGACATCGACATCGAC'

Expected output:

['ATC', 'ATC', 'ATC']
neuron
  • The answer is really in the title. From the accepted answer of your [second link](https://stackoverflow.com/q/312443/6045800) you simply need to change the `step` argument for the `range` function... – Tomerikoo Dec 02 '21 at 16:25

3 Answers

2

I'm sure there are more elegant solutions to this. Whenever I find myself using range(len(some_iterable)) I feel a little dirty.

That being said, you could achieve this with a simple generator.

def window(s: str, size: int, slide_amount: int):
    str_len = len(s)
    for i in range(0, str_len, slide_amount):
        # make sure nothing is yielded smaller than
        # the desired size
        if i + size <= str_len:
            yield s[i:i + size]

print(list(window('ATCGAC', 1, 2)))  # ['A', 'C', 'A']
print(list(window('ATCGAC', 3, 3)))  # ['ATC', 'GAC']
print(list(window('ATCGACATCGACATCGAC', 3, 6)))  # ['ATC', 'ATC', 'ATC']

Alternatively, as a function wrapping a generator expression:

def window(s: str, size: int, slide_amount: int):
    return (
        s[i:i + size] 
        for i in range(0, len(s), slide_amount) 
        if i + size <= len(s)
    )

This can easily be modified to return a list instead:

def window(s: str, size: int, slide_amount: int):
    return [
        s[i:i + size] 
        for i in range(0, len(s), slide_amount) 
        if i + size <= len(s)
    ]
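
If the input is an arbitrary iterable (for example a generator) rather than a string, indexing with range(len(...)) is not available. The following is only a sketch of one way to get the same size/slide behaviour in that case; the name `window_iter` is made up for illustration, and it yields tuples rather than strings.

```python
from collections import deque
from itertools import islice


def window_iter(iterable, size, slide_amount):
    """Yield tuples of `size` items, advancing `slide_amount` items per step.

    Hypothetical helper, not part of the answer above.
    """
    if size < 1 or slide_amount < 1:
        raise ValueError("size and slide_amount must be positive")
    it = iter(iterable)
    # The deque keeps only the most recent `size` items.
    buf = deque(islice(it, size), maxlen=size)
    while len(buf) == size:
        yield tuple(buf)
        # Consume the next `slide_amount` items; stop if the input runs out,
        # so no window shorter than `size` is ever yielded.
        new_items = list(islice(it, slide_amount))
        if len(new_items) < slide_amount:
            return
        buf.extend(new_items)


print(["".join(w) for w in window_iter("ATCGAC", 1, 2)])  # ['A', 'C', 'A']
print(["".join(w) for w in window_iter("ATCGAC", 3, 3)])  # ['ATC', 'GAC']
```

For plain strings and lists, the slicing versions above are simpler and are probably the better choice.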
Axe319
1

What you describe is not really a sliding window; it's just slicing. Building on "Split string every nth character?", we can generalize a function for easy setup:

def custom_slice(iterable, size, step, start=0):
    return [iterable[i:i+size] for i in range(start, len(iterable)-size+1, step)]

And a few examples:

>>> custom_slice('ATCGAC', 1, 2)
['A', 'C', 'A']
>>> custom_slice('ATCGAC', 3, 3)
['ATC', 'GAC']
>>> custom_slice('ATCGACATCGACATCGAC', 3, 6)
['ATC', 'ATC', 'ATC']
>>> custom_slice([1, 2, 3, 4, 5, 6, 7, 8], 2, 4)
[[1, 2], [5, 6]]
>>> custom_slice('ATCGAC', 3, 1)  # that's actually a sliding window
['ATC', 'TCG', 'CGA', 'GAC']

You had a hard time finding the solution (I guess...) because most of the questions and answers on the subject use a single n variable for both the size and the step. The whole difference here is that we separate them into two different variables for better control over the "window".
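
To illustrate how the two parameters interact, here is a small sketch that repeats the custom_slice definition and varies only the step for a fixed size of 2. The specific size and step values are chosen just for illustration.

```python
def custom_slice(iterable, size, step, start=0):
    return [iterable[i:i+size] for i in range(start, len(iterable)-size+1, step)]


seq = 'ATCGAC'

# step == size: plain chunking, no overlap and no gaps
print(custom_slice(seq, 2, 2))  # ['AT', 'CG', 'AC']

# step < size: overlapping slices (the usual "sliding window")
print(custom_slice(seq, 2, 1))  # ['AT', 'TC', 'CG', 'GA', 'AC']

# step > size: characters between slices are skipped
print(custom_slice(seq, 2, 3))  # ['AT', 'GA']
```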

Tomerikoo
  • What is the difference between slicing and sliding window? I am just confused about what defines a sliding window as a sliding window? Does there need to be an overlap between the slices? – neuron Dec 02 '21 at 16:29
  • See for example [What is Sliding Window Algorithm? Examples?](https://stackoverflow.com/q/8269916/6045800). Technically speaking a sliding window usually overlaps. This is of course just technicality, I just think that it came in your way of searching/understanding a solution. The first link you posted is waaaay overcomplicated for what you need because it implements a sliding window on iterators, i.e. can't look forward and back. When dealing with strings, you don't have those restricting requirements... – Tomerikoo Dec 02 '21 at 16:32
  • I see what you're saying. In my question, I was really just trying to find a way to adjust how far the sliding window moved. If I knew a sliding window typically overlapped, I would have based my examples around that. – neuron Dec 02 '21 at 16:41
  • Anyways, thank you for the clarification. I think I have a much better understanding now. I appreciate your help – neuron Dec 02 '21 at 16:43
  • 1
    @neuron In my opinion, sliding window isn't directly about overlaps but about keeping track of some information in the window (like counts of each character) and just updating that information when you slide the window. Instead of recomputing that information from scratch for every window position. Of course that only makes sense if there are overlaps. – Kelly Bundy Dec 02 '21 at 17:27
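
As a rough illustration of the idea in the last comment, the sketch below keeps a running count for the current window and updates it as the window moves one character, instead of recounting every window from scratch. Counting 'G'/'C' characters and the name `gc_counts` are assumptions made for this example only; they do not come from the thread.

```python
def gc_counts(seq, size):
    """Return the number of 'G'/'C' characters in each window of `size`.

    Hypothetical example: the window advances one character at a time and
    the count is updated incrementally.
    """
    if size < 1 or size > len(seq):
        return []
    # Count the first window from scratch.
    count = sum(ch in "GC" for ch in seq[:size])
    counts = [count]
    for i in range(size, len(seq)):
        # Add the character entering the window, drop the one leaving it.
        count += (seq[i] in "GC") - (seq[i - size] in "GC")
        counts.append(count)
    return counts


print(gc_counts("ATCGAC", 3))  # [1, 2, 2, 2]
```

Each step does a constant amount of work regardless of the window size, which is where overlapping windows pay off.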
0

You can slice the string in a while loop. This is not a sliding window.

seq = 'ATCGAC'

def get_slices(seq, window, slide):
    result = []
    index = 0
    # Note: a slice that starts near the end of seq may be shorter than window.
    while index < len(seq):
        result.append(seq[index:index + window])
        index += slide
    return result

print(get_slices(seq, 1, 2))  # ['A', 'C', 'A']
Albin Paul