3

Is there an efficient way in Python to get all partitions of a list of size n into two subsets of size n/2? I want to get some iterative construct such that each iteration provides two non-overlapping subsets of the original list, each subset having size n/2.

For example:

A = [1,2,3,4,5,6]    # here n = 6
# some iterative construct
    # in each iteration, a pair of subsets of size n/2
    # subsets = [[1,3,4], [2,5,6]] for example for one of the iterations
    # subsets = [[1,2,5],[3,4,6]] a different iteration example

The subsets should be non-overlapping, e.g. [[1,2,3], [4,5,6]] is valid but [[1,2,3], [3,4,5]] is not. The order of the two subsets does not matter, e.g. [[1,2,3], [4,5,6]] does not count as different from [[4,5,6], [1,2,3]] and thus only one of those two should appear in an iteration. The order within each subset also does not matter, so [[1,2,3], [4,5,6]], [[1,3,2], [4,5,6]], [[3,2,1], [6,5,4]], etc. all count as the same and so only one of them should show up in whole iteration.

Amit Kumar Gupta
  • 17,184
  • 7
  • 46
  • 64
user3501476
  • 1,095
  • 2
  • 14
  • 26
  • Possible duplicate of [How do you split a list into evenly sized chunks in Python?](http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks-in-python) – idjaw Mar 16 '16 at 02:08
  • @idjaw oh, i think you're more right. Can i cancel my duplicate warning? – pandorym Mar 16 '16 at 02:10
  • @idjaw I am actually very lowly of points, i can't vote on things. that was just a flag that i raised. – pandorym Mar 16 '16 at 02:13
  • 2
    @idjaw: This isn't a duplicate of that question how to split into evenly sized chunks; chunking is separate from arbitrary subsets. – ShadowRanger Mar 16 '16 at 02:18
  • Do the two lists of size n/2 have to be non-overlapping? Or is `[[1,2,3], [1,2,3]]` a valid iteration? – Amit Kumar Gupta Mar 16 '16 at 04:16
  • Also, what if n is odd? – Amit Kumar Gupta Mar 16 '16 at 04:17
  • @AmitKumarGupta Non-overlapping. Also, I am guaranteed that the length of the list will be even. – user3501476 Mar 16 '16 at 04:17
  • Also, you're using the terminology "choose" and "combination". I take that to mean order doesn't matter? However you also say "sublist", which implies order matters. Do you intend to treat `[[1,2,3], [4,5,6]]` as the same as `[[3,2,1], [6,5,4]]`? What about `[[4,5,6], [1,2,3]]`? – Amit Kumar Gupta Mar 16 '16 at 04:18
  • The ones that you listed should all count as one (they should be treated as the same). – user3501476 Mar 16 '16 at 04:20
  • Do you want the output to include both `[[1,2,3], [4,5,6]]` **and** `[[4,5,6], [1,2,3]]`, or just one of those pairs? – PM 2Ring Mar 16 '16 at 04:24
  • @PM2Ring No, I do not. I also would not like to include for example `[[1,2,3], [4,5,6]]` and `[[2,1,3],[4,5,6]] – user3501476 Mar 16 '16 at 04:28
  • FWIW, I've added a faster, simpler itertools solution to my answer. – PM 2Ring Mar 16 '16 at 06:08

5 Answers5

5

You will want to use itertools.combinations to do this. The inputs are the list you want to select items out of and the second is the number of items to select.

result = [list(item) for item in itertools.combinations(input, len(input) // 2)]

For an input of [1,2,3,4] this yields

[[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]

As @ShadowRanger pointed out, if order matters in your lists and you want all permutations, you'll want to substitute itertools.permutations into the solution.

result = [list(item) for item in itertools.permutations(input, len(input) // 2)]
# [[1, 2], [1, 3], [1, 4], [2, 1], [2, 3], [2, 4], [3, 1], [3, 2], [3, 4], [4, 1], [4, 2], [4, 3]]

Edit

Upon reading your question closer it is unclear if you want all n/2 permutations like I have shown or you want a list of lits where each element is yet another list of the two "halves" of the permutation.

To accomplish this, you could do the following (incorporating some indexing help from @Blckknght)

result = [[list(item[::2]), list(item[1::2])] for item in itertools.permutations(input)]

In this case, the output of [1,2,3,4] would be

[[[1, 3], [2, 4]], [[1, 4], [2, 3]], [[1, 2], [3, 4]], [[1, 4], [3, 2]], [[1, 2], [4, 3]], [[1, 3], [4, 2]], [[2, 3], [1, 4]], [[2, 4], [1, 3]], [[2, 1], [3, 4]], [[2, 4], [3, 1]], [[2, 1], [4, 3]], [[2, 3], [4, 1]], [[3, 2], [1, 4]], [[3, 4], [1, 2]], [[3, 1], [2, 4]], [[3, 4], [2, 1]], [[3, 1], [4, 2]], [[3, 2], [4, 1]], [[4, 2], [1, 3]], [[4, 3], [1, 2]], [[4, 1], [2, 3]], [[4, 3], [2, 1]], [[4, 1], [3, 2]], [[4, 2], [3, 1]]]

Edit2

Since order doesn't matter but you want an approach similar to the last one (lists of lists of lists), that's a little tricky with the last approach because of the array slicing. One alternative is to use set and frozenset to construct the initial information (rather than lists), because in a set the ordering doesn't matter when checking for equality. This will automatically allow us to remove duplicates. We can then add an extra step to convert back to a list if that's what you prefer.

from itertools import permutations
tmp = set([frozenset([frozenset(k[::2]),frozenset(k[1::2])]) for k in permutations(input)]) 
result = [[list(el) for el in item] for item in tmp];

This will yield

[[[1, 2], [3, 4]], [[2, 3], [1, 4]], [[1, 3], [2, 4]]]
Community
  • 1
  • 1
Suever
  • 64,497
  • 14
  • 82
  • 101
  • 1
    Possible they'd want `itertools.permutations` instead, unclear on whether order counts (they say "permutation", though the example has ordered selection, and the term "sublist" would often mean "combination", not "permutation"). Just mentioning it in case. – ShadowRanger Mar 16 '16 at 02:20
  • 1
    `[item[::2], item[1::2]]` would be another way of slicing the permutations in half, without needing to find the half-length first (not that that's a costly operation, of course). – Blckknght Mar 16 '16 at 02:32
  • @Suever The last post is exactly what I need. Thanks! – user3501476 Mar 16 '16 at 03:19
  • @Blckknght would there also be a way of slicing combinations in a similar manner? – user3501476 Mar 16 '16 at 03:55
  • @user3501476: No, I don't think it would work right with the combinations as they're already being made in groups of `n//2`. Since they're ordered too, a slice wouldn't get an even sample. I'm pretty sure the `permutations` version is what you want anyway. – Blckknght Mar 16 '16 at 04:34
  • @Blckknght The problem is that I realized I don't want for example [ [[1, 2], [3, 4]], [[1, 2], [4, 3]], [[2,1], [4, 3]], [ [4,3], [2,1] ]... ]. So basically I don't care about order as well. – user3501476 Mar 16 '16 at 04:45
  • @user3501476 Answer has been updated with a solution that should yield what you want – Suever Mar 16 '16 at 10:58
4

Here's a solution which doesn't use itertools. It uses a trick called Gosper's hack to generate bit permutations. See HAKMEM Item 175 for an explanation of how it works; this hack is also mentioned in the Wikipedia article Combinatorial number system. And it features in the accepted answer to this SO question: Iterating over all subsets of a given size.

The parts function is a generator, so you can use it in a for loop, as illustrated in my test.

How it works.

To partiton a list of length n into pairs of sublists of length n/2 we use a binary number bits consisting of n/2 zero bits and n/2 one bits. A zero bit in a given position indicates that the corresponding list element goes into the left sublist, a one bit in a given position indicates that the corresponding list element goes into the right sublist.

Initially, bits is set to 2 ** (n/2) - 1, so if n = 6, bits starts out as 000111.

The generator uses Gosper's hack to permute bits in numerical order, stopping when we get a one bit in the highest position, since that's when we start getting the reversed versions of our sublist pairs.

The code responsible for converting the pattern in bit into the pair of sublists is:

    for i, u in enumerate(lst):
        ss[bits & (1<<i) == 0].append(u)

If there's a zero at bit position i in bits then ss[0] gets the current item from lst, otherwise it's appended to ss[1].

This code runs on Python 2 and Python 3.

from __future__ import print_function

def parts(lst):
    ''' Generate all pairs of equal-sized partitions of a list of even length '''

    n = len(lst)
    if n % 2 != 0:
        raise ValueError('list length MUST be even')

    lim = 1 << (n - 1)
    bits = (1 << n // 2) - 1

    while bits < lim:
        #Use bits to partition lst
        ss = [[], []]
        for i, u in enumerate(lst):
            ss[bits & (1<<i) == 0].append(u)
        yield ss

        #Calculate next bits permutation via Gosper's hack (HAKMEM #175)
        u = bits & (-bits)
        v = bits + u
        bits = v | (((v ^ bits) // u) >> 2)

# Test
lst = list(range(1, 7))
for i, t in enumerate(parts(lst), 1):
    print('{0:2d}: {1}'.format(i, t))    

output

 1: [[1, 2, 3], [4, 5, 6]]
 2: [[1, 2, 4], [3, 5, 6]]
 3: [[1, 3, 4], [2, 5, 6]]
 4: [[2, 3, 4], [1, 5, 6]]
 5: [[1, 2, 5], [3, 4, 6]]
 6: [[1, 3, 5], [2, 4, 6]]
 7: [[2, 3, 5], [1, 4, 6]]
 8: [[1, 4, 5], [2, 3, 6]]
 9: [[2, 4, 5], [1, 3, 6]]
10: [[3, 4, 5], [1, 2, 6]]

I admit that using something inscrutable like Gosper's hack isn't exactly Pythonic. :)


Here's how you capture the output of parts into a list of all the sublists. It also illustrates that parts can handle string input, although it produces the output as lists of strings.

seq = list(parts('abcd'))
print(seq)

output

[[['a', 'b'], ['c', 'd']], [['a', 'c'], ['b', 'd']], [['b', 'c'], ['a', 'd']]] 


Here's another solution, using itertools to generate the combinations. It generates the pairs in a different order to the earlier version. However, it's shorter and easier to read. More importantly, it's significantly faster, between 50 to 100 percent faster in my timeit tests, depending on the list length; the difference appears to get smaller for longer lists.

def parts(lst):
    n = len(lst)
    if n % 2 != 0:
        raise ValueError('list length MUST be even')

    first = lst[0]
    for left in combinations(lst, n // 2):
        if left[0] != first:
            break
        right = [u for u in lst if u not in left]
        yield [list(left), right]

# Test
lst = list(range(1, 7))
for i, t in enumerate(parts(lst), 1):
    print('{0:2d}: {1}'.format(i, t))    

output

 1: [[1, 2, 3], [4, 5, 6]]
 2: [[1, 2, 4], [3, 5, 6]]
 3: [[1, 2, 5], [3, 4, 6]]
 4: [[1, 2, 6], [3, 4, 5]]
 5: [[1, 3, 4], [2, 5, 6]]
 6: [[1, 3, 5], [2, 4, 6]]
 7: [[1, 3, 6], [2, 4, 5]]
 8: [[1, 4, 5], [2, 3, 6]]
 9: [[1, 4, 6], [2, 3, 5]]
Community
  • 1
  • 1
PM 2Ring
  • 54,345
  • 6
  • 82
  • 182
1

Here's an itertools-based generator that I think yields exactly the values you want.

def sub_lists(sequence):
    all_but_first = set(sequence[1:])
    for item in itertools.combinations(sequence[1:], len(sequence)//2 - 1):
        yield [[sequence[0]] + list(item), list(all_but_first.difference(item))]

I avoid near-duplicate outputs in two ways as compared to a permutations based approach in Suever's answer. First, I avoid yielding both [["a", "b"], ["c", "d"]] and [["c", "d"], ["a", "b"]] by forcing all the results to have the first value of the input sequence in the first sublist. I avoid yielding [["a", "b"], ["c", "d"]] and [["a", "b"], ["d", "c"]] by building the second sublist using set-subtraction.

Note that yielding nested tuples might be a little more natural than nested lists. To do that, just change the last line to:

yield (sequence[0],) + item, tuple(all_but_first.difference(item))
Blckknght
  • 100,903
  • 11
  • 120
  • 169
  • Nice algorithm! I'm surprised that generating the combinations without the first element like this isn't significantly faster than my first combinations-based code that uses an `if` test to bail out of the loop. – PM 2Ring Mar 16 '16 at 08:14
1

Since none of the orders matter, but we're making a list of lists of lists (where order inherently matters), we can assume some invariants: in all pairs, the first element in the first pair is 1, and both lists in a pair are in sorted order.

A = [1,2,3,4,5,6]

from itertools import combinations
first, rest = A[0], A[1:]
result = [
            [
                list((first,) + X), 
                [x for x in rest if x not in X]
            ] 
            for X in combinations(rest, len(A)/2 - 1)
         ]
Amit Kumar Gupta
  • 17,184
  • 7
  • 46
  • 64
0

Here's some code that performs timeit tests on the various solutions to this problem.

To make the comparisons fair, I've commented out the argument-checking test in my functions.

I've also added a function parts_combo_set that combines features of my combinations-based answer with that of Blckknght. It appears to be the fastest, except for very small lists.

#!/usr/bin/env python

''' Generate all pairs of equal-sized partitions of a list of even length

    Testing the speed of the code at
    http://stackoverflow.com/q/36025609/4014959

    Written by PM 2Ring 2016.03.16
'''

from __future__ import print_function, division
from itertools import combinations
from timeit import Timer

def parts_combo(lst):
    n = len(lst)
    #if n % 2 != 0:
        #raise ValueError('list length MUST be even')

    first = lst[0]
    for left in combinations(lst, n // 2):
        if left[0] != first:
            break
        right = [u for u in lst if u not in left]
        yield [list(left), right]

def parts_combo_set(lst):
    n = len(lst)
    #if n % 2 != 0:
        #raise ValueError('list length MUST be even')

    first = lst[0]
    allset = set(lst)
    for left in combinations(lst, n // 2):
        if left[0] != first:
            break
        yield [list(left), list(allset.difference(left))]

def parts_gosper(lst):
    n = len(lst)
    #if n % 2 != 0:
        #raise ValueError('list length MUST be even')

    lim = 1 << (n - 1)
    bits = (1 << n // 2) - 1

    while bits < lim:
        #Use bits to partition lst
        ss = [[], []]
        for i, u in enumerate(lst):
            ss[bits & (1<<i) == 0].append(u)
        yield ss

        #Calculate next bits permutation via Gosper's hack (HAKMEM #175)
        u = bits & (-bits)
        v = bits + u
        bits = v | (((v ^ bits) // u) >> 2)

def sub_lists(sequence):
    all_but_first = set(sequence[1:])
    for item in combinations(sequence[1:], len(sequence)//2 - 1):
        yield [[sequence[0]] + list(item), list(all_but_first.difference(item))]

def amit(seq):
    first, rest = seq[0], seq[1:]
    return [
                [
                    list((first,) + t), 
                    [x for x in rest if x not in t]
                ]
                for t in combinations(rest, len(seq) // 2 - 1)
           ]

funcs = (
    parts_combo,
    parts_combo_set,
    parts_gosper,
    sub_lists,
    amit,
)

def rset(seq):
    fset = frozenset
    return fset([fset([fset(u),fset(v)]) for u,v in seq])

def validate():
    func = funcs[0]
    master = rset(func(lst))
    print('\nValidating against', func.func_name)
    for func in funcs[1:]:
        print(func.func_name, rset(func(lst)) == master)

def time_test(loops, reps):
    ''' Print timing stats for all the functions '''
    for func in funcs:
        fname = func.func_name
        print('\n%s' % fname)
        setup = 'from __main__ import lst,' + fname
        #cmd = 'list(%s(lst))' % fname
        cmd = 'for t in %s(lst):pass' % fname
        t = Timer(cmd, setup)
        r = t.repeat(reps, loops)
        r.sort()
        print(r)


num = 6
lst = list(range(1, num + 1))
print('num =', num)

#parts = funcs[0]
#for i, t in enumerate(parts(lst), 1):
    #print('{0:2d}: {1}'.format(i, t))

validate()
time_test(10000, 3)

outputs

time_test(10000, 3)

num = 6

Validating against parts_combo
parts_combo_set True
parts_gosper True
sub_lists True
amit True

parts_combo
[0.58100390434265137, 0.58798313140869141, 0.59674692153930664]

parts_combo_set
[0.74442911148071289, 0.77211689949035645, 0.79338312149047852]

parts_gosper
[1.0791628360748291, 1.089813232421875, 1.1191768646240234]

sub_lists
[0.77199792861938477, 0.79007697105407715, 0.81944608688354492]

amit
[0.60080099105834961, 0.60345196723937988, 0.60417318344116211]

time_test(1000, 3)

num = 8

Validating against parts_combo
parts_combo_set True
parts_gosper True
sub_lists True
amit True

parts_combo
[0.22465801239013672, 0.22501206398010254, 0.23627114295959473]

parts_combo_set
[0.29469203948974609, 0.29857206344604492, 0.30069589614868164]

parts_gosper
[0.43568992614746094, 0.44395184516906738, 0.44651198387145996]

sub_lists
[0.31375885009765625, 0.32754802703857422, 0.37077498435974121]

amit
[0.22520613670349121, 0.22674393653869629, 0.24075913429260254]

time_test(500, 3)

num = 10

parts_combo
[0.52618098258972168, 0.52645611763000488, 0.53866386413574219]

parts_combo_set
[0.40614008903503418, 0.41370606422424316, 0.41525506973266602]

parts_gosper
[1.0068988800048828, 1.026188850402832, 1.1649439334869385]

sub_lists
[0.48507213592529297, 0.50991582870483398, 0.51528096199035645]

amit
[0.48686790466308594, 0.52898812294006348, 0.68387198448181152]

time_test(100, 3)

num = 12

parts_combo
[0.47471189498901367, 0.47522807121276855, 0.4798729419708252]

parts_combo_set
[0.3045799732208252, 0.30534601211547852, 0.35700607299804688]

parts_gosper
[0.83456206321716309, 0.83824801445007324, 0.87273812294006348]

sub_lists
[0.36697721481323242, 0.36919784545898438, 0.38349604606628418]

amit
[0.40012097358703613, 0.40033888816833496, 0.40788102149963379]

time_test(50, 3)

num = 14

parts_combo
[0.97016000747680664, 0.97931098937988281, 1.2653739452362061]

parts_combo_set
[0.81669902801513672, 0.88839983940124512, 0.91469597816467285]

parts_gosper
[1.772817850112915, 1.9343690872192383, 1.9586498737335205]

sub_lists
[0.78162002563476562, 0.79451298713684082, 0.8126368522644043]

amit
[0.89046502113342285, 0.89572596549987793, 0.91031289100646973]

time_test(50, 3)

num = 16

parts_combo
[4.1981601715087891, 4.3565289974212646, 4.3795731067657471]

parts_combo_set
[2.5452880859375, 2.5757780075073242, 2.6059379577636719]

parts_gosper
[7.5856668949127197, 7.6066100597381592, 7.6397140026092529]

sub_lists
[3.677016019821167, 3.6800520420074463, 3.7420001029968262]

amit
[4.1738030910491943, 4.1768841743469238, 4.1960680484771729]

time_test(10, 3)

num = 18

parts_combo
[3.8362669944763184, 3.8807728290557861, 4.0259079933166504]

parts_combo_set
[1.9355819225311279, 1.9540839195251465, 1.9573280811309814]

parts_gosper
[6.3178229331970215, 6.6125278472900391, 7.0462160110473633]

sub_lists
[2.1632919311523438, 2.238231897354126, 2.2747220993041992]

amit
[3.6137850284576416, 3.6162960529327393, 3.6475629806518555]

time_test(10, 3)

num = 20

parts_combo
[16.874133110046387, 17.585763931274414, 19.725590944290161]

parts_combo_set
[7.5462148189544678, 7.5597691535949707, 7.8375740051269531]

parts_gosper
[27.312526941299438, 27.637516021728516, 28.016690015792847]

sub_lists
[7.7865769863128662, 7.8874318599700928, 8.5498230457305908]

amit
[15.554526805877686, 15.626868009567261, 16.224159002304077]

These tests were performed on a 2GHz single-core machine with 2 GB of RAM running Python 2.6.6.

PM 2Ring
  • 54,345
  • 6
  • 82
  • 182