I have this:

def get_set(line, n=3):
    words = line.split()
    for i in range(len(words) - n):
        yield (words[i], words[i+1], words[i+2])

for i in get_set('This is a test'):
    print(i)

But as you can see in the yield call, it's hard-coded to work with 3. How can I rewrite the yield line to work with whatever number is passed via the n kwarg?

(The code generates sets of three consecutive words from a sentence; I want it to generate sets of whatever size I pass as n.)

Wells
  • Are you looking for the `grouper` recipe [here](https://docs.python.org/2/library/itertools.html#recipes)? – BrenBarn Aug 06 '15 at 18:26
  • @BrenBarn I suspect they are, although his implementation is fundamentally different (i.e. [1,2,3,4,5,...] => [1,2,3],[2,3,4],...) – Joran Beasley Aug 06 '15 at 18:30
  • Are you trying to [chunk the list](http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks-in-python)? – TigerhawkT3 Aug 06 '15 at 18:41

3 Answers

You could always just make a tuple out of a slice of the list:

def get_set(line, n=3):
    words = line.split()
    for i in range(len(words) - (n-1)):
        yield tuple(words[i:i+n])

Note that you need to iterate over range(len(words) - (n-1)), not range(len(words) - n), to get every consecutive group of n words; with the smaller bound the last group is dropped.

With

for i in get_set('This is a very long test'):
    print(i)

This gives:

n=3:

('This', 'is', 'a')
('is', 'a', 'very')
('a', 'very', 'long')
('very', 'long', 'test')

n=4:

('This', 'is', 'a', 'very')
('is', 'a', 'very', 'long')
('a', 'very', 'long', 'test')
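
To make the range bound noted above concrete, here is a minimal sketch of the arithmetic. The helper `count_windows` is a hypothetical name introduced only for illustration; it is not part of the original code.

```python
def count_windows(num_words, n):
    # A window of n items can start at indices 0 .. num_words - n inclusive,
    # so there are num_words - n + 1 starting positions (never fewer than 0).
    return max(num_words - n + 1, 0)


def get_set(line, n=3):
    words = line.split()
    for i in range(count_windows(len(words), n)):
        yield tuple(words[i:i + n])


print(count_windows(4, 3))              # 2 windows for 'This is a test'
print(list(get_set('This is a test')))
```

With range(len(words) - n) the four-word sentence would produce only one tuple, which is why the question's original loop misses ('is', 'a', 'test').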

River
for row in zip(*[words[x:] for x in range(n)]):
    yield row

should work I think

for i in range(len(words) - n + 1):  # + 1 so the final window is included
    yield words[i:i+n]

should also work ...

(cast to tuple if needed ...)
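
For reference, a minimal sketch of the zip idea dropped into the full function from the question, assuming Python 3 (for `yield from`). It relies on zip stopping at the shortest slice, so only complete windows come out.

```python
def get_set(line, n=3):
    words = line.split()
    # words[0:], words[1:], ..., words[n-1:] are zipped position by position.
    # zip stops at the shortest slice, so only full windows of n words appear.
    yield from zip(*(words[x:] for x in range(n)))


for window in get_set('This is a very long test', n=4):
    print(window)
```

zip already yields tuples here, so no extra cast is needed with this variant.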

Joran Beasley

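You can use slicing on the list. Note that this version steps through the list n words at a time, so it yields non-overlapping chunks (the last one possibly shorter) rather than overlapping groups: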

def get_set(line, n=3):
    words = line.split()
    for i in range(0, len(words), n):
        yield words[i:i+n]

for i in get_set('This is a test'):
    print(i)

['This', 'is', 'a']
['test']

for i in get_set('This is another very boring test', n=2):
    print(i)

['This', 'is']
['another', 'very']
['boring', 'test']
Chaker