7

I've just written a small recursive programme to generate all the possible subdivisions of a list:

def subdivisions(ls):
    yield [ls]
    if len(ls) > 1:
        for i in range(1, len(ls)):
            for lhs in subdivisions(ls[:i]):
                yield lhs + [ls[i:]]

>>> for x in subdivisions('abcd'): print x
... 
['abcd']
['a', 'bcd']
['ab', 'cd']
['a', 'b', 'cd']
['abc', 'd']
['a', 'bc', 'd']
['ab', 'c', 'd']
['a', 'b', 'c', 'd']

I've brute forced this and it's taken me a long time to figure out. I'm wondering what this is called, as I'm sure there is a name for it.

In general I'm wondering how to learn this stuff from a mathematical point of view and whether there are good well known programming libraries that cover useful algorithms like this (I know of https://docs.python.org/3/library/itertools.html )


[Edit] the question that this is marked as duplicate of - get all possible partitions of a set - gets a different answer.

It is looking for { {{1,2,3},{}} , {{1},{2,3}} , {{1,2},{3}} , {{1,3},{2}}, {{1},{2},{3}}} while a correct answer for me (in it's terminology) would be { {{1,2,3}} , {{1},{2,3}} , {{1,2},{3}} , {{1},{2},{3}}}

Also, the point of asking the question was to figure out what the terminology of this is; I'm calling it 'subdivisions'; that answer is calling it 'partitions'. I'm looking for a good resource which enumerates all of these patterns, so that people don't go reinventing the wheel.

EoghanM
  • 25,161
  • 23
  • 90
  • 123
  • permutation perhaps. – Yuriy Kravets Jul 06 '18 at 22:05
  • This is different; there are 24 permutations of 'abcd' and they don't preserve order. Similarly combinations do something different as you get subsets of the list of a single size. – EoghanM Jul 06 '18 at 22:11
  • https://en.wikipedia.org/wiki/Partition_of_a_set – Eugene Primako Jul 06 '18 at 22:15
  • Thanks very much @EugenePrimako ! The difference would be that this is a list (ordered) and not a set, and I want to preserve ordering. I've provided my own answer from a programming pov, what I'm hoping for in an answer is a pointer to some resources in order not to reinvent the wheel. – EoghanM Jul 06 '18 at 22:22
  • 1
    @Prune I don't understand how is this Python question a duplicate of the Java question you proposed? – pylang Jul 06 '18 at 22:46
  • 2
    @pylang: Thanks for the heads-up; I see the difference. Question is reopened. – Prune Jul 06 '18 at 23:00
  • Thanks @EugenePrimako, can you now put that comment in an answer?? And to reiterate; there's a meta element to this question; I find myself writing these sorts of things knowing in the back of my mind that someone has done it already, so I'm looking for how to get knowledge about terminology such as 'Power set' and 'Partition' that I didn't have the google-fu for. Are there any other libraries apart from itertools? – EoghanM Jul 06 '18 at 23:05
  • This might do the job a bit faster (using itertools): `from itertools import chain, combinations; s = 'abcd'; indices = chain(*(combinations(range(1, len(s)), i) for i in range(len(s)))); res = [[s[x:y] for x, y in zip((0,) + i, i + (len(s),))] for i in indices]; print(res)`. It's basically considering all combinations from 0-length up to N-1 where N is the sequence's length. Though I'm not sure how this kind of chain is called. – a_guest Jul 06 '18 at 23:30

3 Answers3

4

Let me give some mathematical interpretation of this problem.

Imagine: you have list abcd. If you put some separators in it - like a|bc|d - you'll divide it into sublists. All the possible separators are a|b|c|d (their count is N-1, where N is a size of list). Let's call them (separators) 1, 2, 3.

Then all the subdivisions of your list will be generated by all the combinations of set {1, 2, 3}. There will be 2**3 = 8 of them: each element can be in combination or not. (All these combinations are called powerset).

That can help you to list all the subdivisions without recursion: you just to iterate binary numbers from 0b000 to 0b111 inclusive (range(0, 2**(N-1))):

from itertools import zip_longest, chain

def yield_possible_splits(string):
    N = len(string)
    for i in range(2 ** (N-1)):
        spaces_bitmask = bin(i).replace('0b', '').rjust(N, '0')
        spaces = [' ' if bit == '1' else '' for bit in spaces_bitmask]
        yield ''.join(chain(*zip_longest(spaces, string, fillvalue='')))

Or equivalent variant using itertools.product instead of binary manipulations:

from itertools import zip_longest, chain, product

def yield_possible_splits(string):
    N = len(string)
    for spaces in product(['', ' '], repeat=N-1):
        yield ''.join(chain(*zip_longest(string, spaces, fillvalue='')))

Usage:

print(list(yield_possible_splits('abcd')))
# ['abcd', 'abc d', 'ab cd', 'ab c d', 'a bcd', 'a bc d', 'a b cd', 'a b c d']
Eugene Primako
  • 2,767
  • 9
  • 26
  • 35
  • I think it's `range(0b000, 0b1000)`? I'm missing the step about how to use the binary value of each separator to split the list! – EoghanM Jul 06 '18 at 23:12
  • @EoghanM Yes, I meant "inclusive". Updated my answer. – Eugene Primako Jul 06 '18 at 23:39
  • 1
    I prefer your second solution, but it gives me `['abcd', 'ab cd', 'a bcd', 'a b cd', ' abcd', ' ab cd', ' a bcd', ' a b cd']` (note the space before `a` and that there never is a space between `cd`). I think you need to swap the arguments to `zip_longest`. I think it is also clearer to use the `repeat` argument in `product`, as in `product(['', ' '], repeat=N-1)`. – Bas Swinckels Jul 07 '18 at 12:55
  • @BasSwinckels Thanks a lot! It was my typo when refactoring the code :( Concerning ```repeat``` - you are right, it looks much better now. – Eugene Primako Jul 07 '18 at 17:59
3

Finding all partitions of a list is equivalent to finding all sets of indices at which to slice the list.

By example, given the list l = [1, 2, 3, 4], we can represent the partition [[1, 2], [3], [4]] by the list of indices [2, 3]. In particular, there is a one-to-one correspondence between such list of indices and partitions.

This means, given a list l we can find the powerset of range(1, len(l)) and find each corresponding partition.

Code

This solution uses the powerset function from itertools recipes. Using generators is more efficient than using recursion.

from itertools import chain, combinations

def powerset(iterable):
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

def partitions(lst):
    for indices in powerset(range(1, len(lst))):
        partition = []
        i = 0
        for j in indices:
            partition.append(lst[i:j])
            i = j
        partition.append(lst[i:])

        yield partition

Example

print(*partitions([1, 2, 3]))
# [[1, 2, 3]] [[1], [2, 3]] [[1, 2], [3]] [[1], [2], [3]]
Olivier Melançon
  • 21,584
  • 4
  • 41
  • 73
  • Marking as the accepted answer because of the link to https://docs.python.org/3/library/itertools.html#itertools-recipes which is what I missed when looking up itertools to find an existing solution for this problem. – EoghanM Jul 13 '18 at 12:36
0

My solution:

from itertools import chain, product
def p(l):
    return {(l,)} | {tuple(chain(*s)) for i in range(1, len(l)) for s in product(p(l[:i]), p(l[i:]))}

p('abcd') returns:

{('a', 'bcd'), ('abcd',), ('abc', 'd'), ('ab', 'c', 'd'), ('ab', 'cd'), ('a', 'b', 'cd'), ('a', 'bc', 'd'), ('a', 'b', 'c', 'd')}
blhsing
  • 91,368
  • 6
  • 71
  • 106
  • Lovely concise solution - Python3 only right? Unfortunately doesn't work when a list is passed in: >>> p(['a', 'b', 'c', 'd']) TypeError: unhashable type: 'list' – EoghanM Jul 13 '18 at 12:22
  • Thanks. It works in Python 2.7 too. But you're right that passing a list to it wouldn't work because this solution relies on using set to eliminate duplicates, and a sequence needs to be hashable in order to be added to a set, which a list is not. You can always cast the list to a tuple before passing in though. – blhsing Jul 13 '18 at 12:37
  • Cool, generating duplicates and then discarding them seems less efficient than other solutions. – EoghanM Jul 20 '18 at 13:03