Problem description: I want to look at each term in a text together with a window of, say, 3 words to its left and 3 to its right. The base case has the form w-3 w-2 w-1 term w+1 w+2 w+3. I want to implement a sliding window over my text that records the context words of each term. Every word is treated as the term once; when the window moves on, it becomes a context word. However, when the term is the 1st word in a line, there are no context words on the left (t w+1 w+2 w+3); when it is the 2nd word, there is only one context word on the left, and so on. I am looking for hints on implementing this flexible sliding window in Python without handling each possible situation separately.

To recap:

Example of input:

["w1", "w2", "w3", "w4", "w5", "w6", "w7", "w8", "w9", "w10"]

Output:

t1 w2 w3 w4

w1 t2 w3 w4 w5

w1 w2 t3 w4 w5 w6

w1 w2 w3 t4 w5 w6 w7

__ w2 w3 w4 t5 w6 w7 w8

__ __ etc.

My current plan is to implement this with a separate condition for each line in the output.

sim

1 Answer


If you want a sliding window of n words, use a double-ended queue with maximum length n to implement a buffer.

This should illustrate the concept:

from collections import deque

mystr = "StackOverflow"
# A deque with maxlen=5 discards its oldest item once a sixth is appended.
window = deque(maxlen=5)
for char in mystr:
    window.append(char)
    print(''.join(window))

Output:

S
St
Sta
Stac
Stack
tackO
ackOv
ckOve
kOver
Overf
verfl
erflo
rflow
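
To connect this back to the question's centered window, here is a minimal sketch rather than a definitive implementation. It assumes the words are already in a list in memory. The function name `context_windows` and its parameter `n` are hypothetical, chosen only for illustration. A deque with `maxlen=n` keeps the left context, which is naturally shorter near the start, and a plain slice supplies the right context, which is naturally shorter near the end:

```python
from collections import deque


def context_windows(words, n=3):
    """Yield (left_context, term, right_context) for every word in words.

    Hypothetical helper: up to n words on each side, fewer at the edges.
    """
    left = deque(maxlen=n)  # holds at most the n most recently seen words
    for i, term in enumerate(words):
        # Slicing past the end of a list simply returns fewer items.
        right = words[i + 1:i + 1 + n]
        yield list(left), term, right
        left.append(term)  # the term becomes left context for the next word


words = ["w1", "w2", "w3", "w4", "w5", "w6", "w7", "w8", "w9", "w10"]
for left, term, right in context_windows(words):
    print(left, term, right)
```

No edge case is written out explicitly: the deque starts empty and fills up on its own, and the slice shrinks on its own near the end of the list.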
Li-aung Yip
  • Note that the [maxlen arg](http://docs.python.org/library/collections.html#collections.deque.maxlen) was introduced in python 2.7 – jrennie May 08 '12 at 14:56
  • Thanks, Li-aung, this was useful for me. I am now recording contexts of a term with deque running from beginning and the end of the file. What I needed is deque's flexibility to store a maximum length of elements but possibly also less. – sim May 09 '12 at 14:05