
My problem:

I have, e.g., 4 numbers

a, b, c, d = np.random.rand(4)

and I need all of the possible sums a + b - c, a + b - d, b + c - d, b + c - a, and so on

I have found that I can proceed like this

from itertools import permutations

p = np.array(list(set(permutations([1, 1, -1, 0])))).T
sums = np.random.rand(4) @ p

and so far so good, I've vectorized my problem... what hurts my feelings is the use of permutations(…), because the number of permutations of n numbers is of course n!, while the filter applied by set reduces the significant permutations to only a few percent of that.

My question:

Is it possible to obtain, e.g.,

p = np.array(list(set(permutations([1]*5 + [-1]*4 + [0]*6)))).T

without computing all the 1,307,674,368,000 (mostly identical) permutations?


Following the advice of Lukas Thaler and of Chris_Rands I have checked the possible use of sympy.utilities.iterables.multiset_permutations

In [1]: from itertools import permutations                                                

In [2]: from sympy.utilities.iterables import multiset, multiset_permutations             

In [3]: for n in (2,3): 
   ...:     l = [1]*n + [2]*n + [3]*n 
   ...:     ms = multiset(l) 
   ...:     print(n*3) 
   ...:     %timeit list(multiset_permutations(ms)) 
   ...:     %timeit set(permutations(l)) 
   ...:                                                                                   
6
568 µs ± 3.71 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
91.3 µs ± 571 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
9
9.32 ms ± 209 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
58.3 ms ± 323 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

It seems to me that

  1. for negligible amounts of computation it doesn't matter anyway
  2. when the problem size grows, the pure Python implementation in SymPy easily overcomes the brute-force approach using set and itertools.permutations

so that I conclude that my question is a duplicate of the question referenced by Chris and I vote to close it.
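For completeness, here is a minimal sketch (my own addition, using the 4-number example from the question) of how `multiset_permutations` yields the matrix `p` directly, without ever materializing the n! duplicated permutations:

```python
import numpy as np
from sympy.utilities.iterables import multiset_permutations

# Distinct arrangements of the multiset [1, 1, -1, 0]: 4!/2! = 12,
# generated directly instead of filtering all 4! = 24 permutations.
p = np.array(list(multiset_permutations([1, 1, -1, 0]))).T

vals = np.random.rand(4)
sums = vals @ p  # all 12 signed sums at once
```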

gboffi
    this? https://stackoverflow.com/questions/6284396/permutations-with-unique-values – Chris_Rands Dec 18 '19 at 14:32
  • https://stackoverflow.com/questions/46895696/prevent-duplicates-from-itertool-permutation – Chris_Rands Dec 18 '19 at 14:33
  • @Chris_Rands Re https://stackoverflow.com/questions/46895696/prevent-duplicates-from-itertool-permutation the accepted answer suggests using `set` to filter the `permutations` results, so I don't need that answer (that technique is already used in my question). I'll look at the other question you suggested; it looks more promising, but at the moment I'm not sure it is a duplicate (the OP has an edit saying the problem is about `itertools.product`, and that is not my case). – gboffi Dec 18 '19 at 14:41
  • 1
    Here's a classic permutation algorithm which handles duplicated items correctly: https://stackoverflow.com/a/31678111/4014959 My code in that answer is for strings, but it should be easy to adapt it for your purposes. – PM 2Ring Dec 18 '19 at 14:52
  • 1
    [sympy.utilities.iterables.multiset_permutations](https://docs.sympy.org/latest/modules/utilities/iterables.html#sympy.utilities.iterables.multiset_permutations) seems to solve your problem – Lukas Thaler Dec 18 '19 at 15:05
  • @PM2Ring Thank you, I'll look into it. I was hoping for a C speed solution but it's likely that a better algorithm wins hands down for _n!_ sufficiently large. I'll update with some stats as soon as I have some time to benchmark. – gboffi Dec 18 '19 at 15:28
  • @LukasThaler Thank you Lukas, the `multiset_permutations` is cited in an answer to https://stackoverflow.com/questions/6284396/permutations-with-unique-values, reported by Chris Rands. I fear that it's a pure Python implementation, again it's a matter of benchmarking when algorithm beats brute force... – gboffi Dec 18 '19 at 15:31

2 Answers


We can do this by first using combinations to pick the numbers used in the summation and then using it again to pick the positive (and negative) numbers.

from itertools import combinations, chain
from random import random

def our_sums_of(nums, num_positive):
    return (2 * sum(positives) - sum(nums) 
            for positives in combinations(nums, num_positive))

nums = [random() for _ in range(15)]
num_positive = 5
num_negative = 4

list(chain.from_iterable(map(lambda c: our_sums_of(c, num_positive), 
                             combinations(nums, num_positive + num_negative))))
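A quick sanity check (my addition, not part of the original answer) comparing this generator against the question's brute-force matrix for the 4-number case with 2 positives, 1 negative, and 1 zero:

```python
import numpy as np
from itertools import permutations, combinations, chain

def our_sums_of(nums, num_positive):
    # 2*sum(positives) - sum(nums) flips the sign of the non-chosen elements
    return (2 * sum(positives) - sum(nums)
            for positives in combinations(nums, num_positive))

vals = np.random.rand(4)

# brute force from the question: all distinct permutations of [1, 1, -1, 0]
p = np.array(list(set(permutations([1, 1, -1, 0])))).T
brute = sorted(vals @ p)

# this answer: pick 3 of the 4 numbers, then 2 positives among them
fast = sorted(chain.from_iterable(
    our_sums_of(c, 2) for c in combinations(vals, 3)))

assert np.allclose(brute, fast)  # same 12 sums, both ways
```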
Kyle Parsons
  • OP has 4 numbers in their post though, and they may be looking for a program which can handle any amount of numbers. – AMC Dec 18 '19 at 15:03
  • @AlexanderCécile I see that now. I've adjusted my answer to handle that more general situation. – Kyle Parsons Dec 18 '19 at 15:25
  • Kyle, as far as I can tell you provided a smart way to compute the sums, but if you read carefully my question you should recognize that **I'm interested in an efficient way of computing `p`**, so no, I'm sorry but your answer is not what I'm looking for. – gboffi Dec 18 '19 at 16:11

The complexity can be much lower using combinations:

from itertools import combinations
import numpy as np

z = 6 # number of zeros
p = 5 # number of 1s
n = 4 # number of -1s

indexes = set(range(z+p+n))

res = np.array([(*plus, *minus, *(zero for zero in indexes - set(plus) - set(minus)))
                for plus in combinations(indexes, p)
                for minus in combinations(indexes-set(plus), n)])

perm = np.zeros((res.shape[0], z+p+n), dtype=np.int8)
perm[np.arange(res[:,:p].shape[0])[:,None], res[:,:p]] = 1
perm[np.arange(res[:,p:p+n].shape[0])[:,None], res[:,p:p+n]] = -1

Basically, instead of computing all the permutations ((n+p+z)! of them), I first select the positions of the positive elements ((n+p+z)!/(p!(n+z)!) choices) and then the positions of the negative ones ((n+z)!/(n!z!) choices). The resulting number of rows is therefore (n+p+z)!/(n!p!z!), the multinomial coefficient.
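As a sanity check (my addition), the row count produced by the two nested combinations indeed equals that multinomial coefficient; for the question's z=6, p=5, n=4 it is 630630 rows instead of 15! ≈ 1.3·10^12 permutations:

```python
from math import comb, factorial

z, p, n = 6, 5, 4  # zeros, +1s, -1s, as above

# first choose the positions of the +1s, then the positions of the -1s
rows = comb(z + p + n, p) * comb(z + n, n)

# equals the multinomial coefficient (n+p+z)! / (n! p! z!)
assert rows == factorial(z + p + n) // (factorial(n) * factorial(p) * factorial(z))
print(rows)  # 630630
```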

Riccardo Bucco
  • Riccardo, my fault but I do not see how I should apply your method to my problem. Could you please apply it to, e.g., `[-1]*2+[0]*3+[1]*2` ? – gboffi Dec 18 '19 at 15:35
  • @gboffi I made an example. Check if it works for you. I can also explain you why this is faster than your approach – Riccardo Bucco Dec 18 '19 at 15:41
  • I'll repeat myself: probably my fault but I do not see how I should apply your method to my problem. – gboffi Dec 18 '19 at 15:43
  • @gboffi You need to compute all the sums in an efficient way, don't you? Well, this is what my algorithm does. What is not clear? – Riccardo Bucco Dec 18 '19 at 15:46
  • _My question:_ is it possible to obtain, e.g., `p = …` without computing all the 1,307,674,368,000 (mostly identical) permutations? IOW, I'm interested in `p` and not in the sums — you are correct that possibly we have better ways of computing the sums **but I'm interested in `p`** – gboffi Dec 18 '19 at 15:50
  • @gboffi Ok, I fixed the answer to find what you have asked for. It's still performing better than your solution – Riccardo Bucco Dec 18 '19 at 16:38
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/204532/discussion-between-riccardo-bucco-and-gboffi). – Riccardo Bucco Dec 19 '19 at 07:52