So if I had the numbers [1,2,2,3] and I want k=2 partitions I'd have [1][2,2,3], [1,2][2,3], [2,2][1,3], [2][1,2,3], [3][1,2,2], etc.
4 Answers
See an answer in Python at Code Review.

- 1
- 1

- 2,541
- 5
- 31
- 52
-
This code generates repeats and only takes contiguous subsets – KaliMa Mar 10 '13 at 00:56
-
to lose the repetitions you can do something like: set(results) – o17t H1H' S'k Mar 10 '13 at 00:57
-
did you look at the code in the first answer by Adeel Zafar Soomro? – o17t H1H' S'k Mar 10 '13 at 01:00
-
+1 for finding an existing solution while I was searching for ideas ... Also, the solution by user3569 may be slow but I've just run it and it definitely produces non-contiguous subsets and does not produce repeats. – Simon Mar 10 '13 at 01:10
-
@Simon That code does not work properly though, as there are times where it won't return a k-tuple but a k-1 tuple. edit: it's also severely limited by memory – KaliMa Mar 10 '13 at 01:15
-
It would be valuable to show one or more specific test cases where the solutions shown at Code Review fail to satisfy this question's requirements, and to say how they fail. – Simon Mar 10 '13 at 01:22
-
Indeed, for [2,2,2,2,3,3,5], k=3, user3569's solution produces five 2-tuples, instead of exclusively 3-tuples. – Simon Mar 10 '13 at 01:43
user3569's solution at Code Review produces five 2-tuples for the test case below, instead of exclusively 3-tuples. However, removing the frozenset()
call for the returned tuples leads to the code returning exclusively 3-tuples. The revised code is as follows:
from itertools import chain, combinations
def subsets(arr):
""" Note this only returns non empty subsets of arr"""
return chain(*[combinations(arr,i + 1) for i,a in enumerate(arr)])
def k_subset(arr, k):
s_arr = sorted(arr)
return set([i for i in combinations(subsets(arr),k)
if sorted(chain(*i)) == s_arr])
s = k_subset([2,2,2,2,3,3,5],3)
for ss in sorted(s):
print(len(ss)," - ",ss)
As user3569 says "it runs pretty slow, but is fairly concise".
(EDIT: see below for Knuth's solution)
The output is:
3 - ((2,), (2,), (2, 2, 3, 3, 5))
3 - ((2,), (2, 2), (2, 3, 3, 5))
3 - ((2,), (2, 2, 2), (3, 3, 5))
3 - ((2,), (2, 2, 3), (2, 3, 5))
3 - ((2,), (2, 2, 5), (2, 3, 3))
3 - ((2,), (2, 3), (2, 2, 3, 5))
3 - ((2,), (2, 3, 3), (2, 2, 5))
3 - ((2,), (2, 3, 5), (2, 2, 3))
3 - ((2,), (2, 5), (2, 2, 3, 3))
3 - ((2,), (3,), (2, 2, 2, 3, 5))
3 - ((2,), (3, 3), (2, 2, 2, 5))
3 - ((2,), (3, 5), (2, 2, 2, 3))
3 - ((2,), (5,), (2, 2, 2, 3, 3))
3 - ((2, 2), (2, 2), (3, 3, 5))
3 - ((2, 2), (2, 3), (2, 3, 5))
3 - ((2, 2), (2, 5), (2, 3, 3))
3 - ((2, 2), (3, 3), (2, 2, 5))
3 - ((2, 2), (3, 5), (2, 2, 3))
3 - ((2, 3), (2, 2), (2, 3, 5))
3 - ((2, 3), (2, 3), (2, 2, 5))
3 - ((2, 3), (2, 5), (2, 2, 3))
3 - ((2, 3), (3, 5), (2, 2, 2))
3 - ((2, 5), (2, 2), (2, 3, 3))
3 - ((2, 5), (2, 3), (2, 2, 3))
3 - ((2, 5), (3, 3), (2, 2, 2))
3 - ((3,), (2, 2), (2, 2, 3, 5))
3 - ((3,), (2, 2, 2), (2, 3, 5))
3 - ((3,), (2, 2, 3), (2, 2, 5))
3 - ((3,), (2, 2, 5), (2, 2, 3))
3 - ((3,), (2, 3), (2, 2, 2, 5))
3 - ((3,), (2, 3, 5), (2, 2, 2))
3 - ((3,), (2, 5), (2, 2, 2, 3))
3 - ((3,), (3,), (2, 2, 2, 2, 5))
3 - ((3,), (3, 5), (2, 2, 2, 2))
3 - ((3,), (5,), (2, 2, 2, 2, 3))
3 - ((5,), (2, 2), (2, 2, 3, 3))
3 - ((5,), (2, 2, 2), (2, 3, 3))
3 - ((5,), (2, 2, 3), (2, 2, 3))
3 - ((5,), (2, 3), (2, 2, 2, 3))
3 - ((5,), (2, 3, 3), (2, 2, 2))
3 - ((5,), (3, 3), (2, 2, 2, 2))
Knuth's solution, as implemented by Adeel Zafar Soomro on the same Code Review page can be called as follows if no duplicates are desired:
s = algorithm_u([2,2,2,2,3,3,5],3)
ss = set(tuple(sorted(tuple(tuple(y) for y in x) for x in s)))
I haven't timed it, but Knuth's solution is visibly faster, even for this test case.
However, it returns 63 tuples rather than the 41 returned by user3569's solution. I haven't yet gone through the output closely enough to establish which output is correct.
-
-
Given the difficulty that we have had in finding a solution that produces correct output, despite its memory and speed limitations, this solution could at least be the basis of an automated test of a solution that ran faster and used less memory. I am very much in favour of first getting code to work correctly, then thinking about speed. I think the next step might be to adapt Knuth's solution to meet the specifications of this question, because Knuth's solution will be much faster if we can get it to work for this problem. – Simon Mar 10 '13 at 02:34
-
Is there a way to modify Knuth's algorithm to not create so many duplicates? – KaliMa Mar 10 '13 at 03:51
-
I've edited my answer to include code that calls Knuth's algorithm, as expressed on the Code Review page, while dropping duplicates. – Simon Mar 10 '13 at 04:34
-
I believe that still generates the duplicates (removing them after the fact is not too hard to do); I mean is there a way to generate unique solutions directly? – KaliMa Mar 10 '13 at 04:36
-
Knuth's algorithm as implemented by Soomro does generate duplicates, but (if the outputs are correct, which we have yet to verify), it does so very quickly and removing them after the fact is not time-consuming, so it is an improvement on the earlier algorithm we were considering. – Simon Mar 10 '13 at 04:43
Here's a version in Haskell:
import Data.List (nub, sort, permutations)
parts 0 = []
parts n = nub $ map sort $ [n] : [x:xs | x <- [1..n`div`2], xs <- parts(n - x)]
partition [] ys result = sort $ map sort result
partition (x:xs) ys result =
partition xs (drop x ys) (result ++ [take x ys])
partitions xs k =
let variations = filter (\x -> length x == k) $ parts (length xs)
in nub $ concat $ map (\x -> mapVariation x (nub $ permutations xs)) variations
where mapVariation variation = map (\x -> partition variation x [])
OUTPUT:
*Main> partitions [1,2,2,3] 2
[[[1],[2,2,3]],[[1,2,3],[2]],[[1,2,2],[3]],[[1,2],[2,3]],[[1,3],[2,2]]]

- 23,602
- 3
- 25
- 61
Python solution:
pip install PartitionSets
Then:
import partitionsets.partition
filter(lambda x: len(x) == k, partitionsets.partition.Partition(arr))
The PartitionSets implementation seems to be pretty fast however it's a pity you can't pass number of partitions as an argument, so you need to filter your k-set partitions from all subset partitions.
You may also want to look at: similar topic on researchgate.

- 51
- 1
- 8