2

I have a list:

["toaster", "oven", "door"]  

I need to get all the possible sequences of adjacent words that can be created from it. The output should look like this:

["toaster", "toaster oven", "toaster oven door", "oven", "oven door", "door"]

What is the most efficient way to get this list? I've looked at itertools.combinations() and a few other suggestions found on Stack Overflow, but found nothing that produces this exact result.

For example, the above list is not a powerset, because only words adjacent to each other in the input list should be used. A powerset would combine toaster and door into toaster door, but those two words are not adjacent.

Martijn Pieters
django-d

3 Answers

10

You can do it like this:

words = ["toaster", "oven", "door"]  

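length = len(words)
out = []
# start is the index of the first word in each phrase
for start in range(length):
    # end is the exclusive slice bound, one past the last word
    for end in range(start + 1, length + 1):
        out.append(' '.join(words[start:end]))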

print(out)

# ['toaster', 'toaster oven', 'toaster oven door', 'oven', 'oven door', 'door']

Each phrase is determined by its first and last word, so you just need to loop over every valid pair of start and end positions.

You could also use a list comprehension:

[' '.join(words[start:end]) for start in range(length) for end in range(start+1, length+1)]

# ['toaster', 'toaster oven', 'toaster oven door', 'oven', 'oven door', 'door']
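
The same start/end idea can also be sketched with itertools.combinations(), applied to slice boundaries rather than to the words themselves. This is only an illustration, not part of the original answer; the function name `adjacent_phrases` is made up for the example.

```python
from itertools import combinations


def adjacent_phrases(words):
    """Return every run of adjacent words, joined with spaces.

    Hypothetical helper for illustration: each pair (start, end) of
    slice boundaries with start < end picks out one contiguous run.
    """
    words = list(words)
    bounds = range(len(words) + 1)  # boundaries 0..len(words)
    return [" ".join(words[start:end]) for start, end in combinations(bounds, 2)]


phrases = adjacent_phrases(["toaster", "oven", "door"])
print(phrases)
# ['toaster', 'toaster oven', 'toaster oven door', 'oven', 'oven door', 'door']
```

Because combinations() emits index pairs in lexicographic order, the phrases come out grouped by starting word, as in the question.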
Thierry Lathuille
  • Note: you can use `for start in range(L-1)` with `L = len(words)` because: 1) using lowercase `l` is **evil** and 2) the OP does not want the empty string and you are doing an unnecessary iteration anyway. – Giacomo Alzetta May 21 '18 at 12:13
  • Not sure why you range `start` all the way through to length + 1, you can safely remove the `+1` there. It's not a *problem* because `range(3, 3)` is empty, but you make Python do that extra step without reason. – Martijn Pieters May 21 '18 at 12:32
  • @GiacomoAlzetta: no, not `L-1`, then `start` ends at `1`, so you'd never get `door` on its own. – Martijn Pieters May 21 '18 at 12:33
  • @GiacomoAlzetta: and no to using `L` either. Just use `length`. – Martijn Pieters May 21 '18 at 12:34
3

You want to create sliding windows of increasing length. Use the window() function from the top answer to the sliding-window question (linked in the code) inside a range() loop to increment the lengths:

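from itertools import islice, chain

# sliding-window definition, as in https://stackoverflow.com/a/6822773
def window(seq, n=2):
    """Return a sliding window (of width n) over data from the iterable."""
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        result = result[1:] + (elem,)
        yield result

def increasing_slices(seq):
    seq = list(seq)
    # chain the windows of width 1, 2, ..., len(seq)
    return chain.from_iterable(window(seq, n=i) for i in range(1, len(seq) + 1))

for combo in increasing_slices(["toaster", "oven", "door"]):
    print(' '.join(combo))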

This outputs:

toaster
oven
door
toaster oven
oven door
toaster oven door
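
For a sequence that already fits in memory, the same length-ordered result can be sketched with plain slicing instead of a window() helper. This is a minimal illustration under that assumption, not the answer's own code; the name `slices_by_length` is hypothetical.

```python
def slices_by_length(seq):
    """Yield contiguous slices of seq, shortest first.

    Hypothetical helper for illustration: for each width from 1 to
    len(seq), slide a window of that width across the list.
    """
    seq = list(seq)
    for width in range(1, len(seq) + 1):
        for start in range(len(seq) - width + 1):
            yield seq[start:start + width]


by_length = [" ".join(chunk) for chunk in slices_by_length(["toaster", "oven", "door"])]
print(by_length)
# ['toaster', 'oven', 'door', 'toaster oven', 'oven door', 'toaster oven door']
```

The phrases are the same as in the question, but grouped by length rather than by starting word.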
Martijn Pieters
0
import itertools

a = ['toaster', 'oven', 'door']

result = []
# collect the combinations of every size from 1 to len(a)
for size in range(1, len(a) + 1):
    result += [' '.join(e) for e in itertools.combinations(a, size)]

print(result)

What do you think about this solution? Note that itertools.combinations() also pairs non-adjacent words, so the result includes 'toaster door':

['toaster', 'oven', 'door', 'toaster oven', 'toaster door', 'oven door', 'toaster oven door']
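
To see how combinations of the words differ from the adjacent-only phrases the question asks for, the two sets can be compared directly. This is a small sketch for illustration; the variable names are made up for the example.

```python
from itertools import combinations

words = ["toaster", "oven", "door"]

# Every combination of words, of every size (order preserved, gaps allowed).
all_combos = {
    " ".join(combo)
    for size in range(1, len(words) + 1)
    for combo in combinations(words, size)
}

# Only contiguous runs: pick a start and end slice boundary.
contiguous = {
    " ".join(words[start:end])
    for start, end in combinations(range(len(words) + 1), 2)
}

# Phrases that skip over a word and so do not belong in the desired output.
extra = sorted(all_combos - contiguous)
print(extra)
# ['toaster door']
```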
Michal