14

Suppose we have n bins in which we are throwing k balls. What is a fast (i.e. using numpy/scipy instead of python code) way to generate all possible outcomes as a matrix?

For example, if n = 4 and k = 3, we'd want the following numpy.array:

3 0 0 0
2 1 0 0
2 0 1 0
2 0 0 1
1 2 0 0
1 1 1 0
1 1 0 1
1 0 2 0
1 0 1 1
1 0 0 2
0 3 0 0
0 2 1 0
0 2 0 1
0 1 2 0
0 1 1 1
0 1 0 2
0 0 3 0
0 0 2 1
0 0 1 2
0 0 0 3

Apologies if any permutation was missed, but this is the general idea. The generated permutations don't have to be in any particular order, but the above list was convenient for categorically iterating through them mentally.

Better yet, is there a way to map every integer from 1 to the multiset number (the cardinality of this list) directly to a given permutation?

This question is related to the following ones, which are implemented in R with very different facilities:

Also related references:

Community
  • 1
  • 1
Andrew Mao
  • 35,740
  • 23
  • 143
  • 224
  • Does it need to be in that order? – Alyssa Haroldsen Jun 08 '16 at 20:03
  • @Kupiakos nope. And I didn't realize, the person that posted that first question made the same list. – Andrew Mao Jun 08 '16 at 20:04
  • 1
    Thinking in terms of [stars and bars](https://en.wikipedia.org/wiki/Stars_and_bars_(combinatorics)), use an algorithm for "unranking combinations" to find the locations of the bars (or stars). One such algorithm is outlined here: [Finding the k-combination for a given number](https://en.wikipedia.org/wiki/Combinatorial_number_system#Finding_the_k-combination_for_a_given_number). –  Jun 09 '16 at 08:36
  • Thanks @morningsun. I'd implement it myself, but it would be good to first know if something similar already exists in the Python world. – Andrew Mao Jun 09 '16 at 15:58
  • These are [weak integer compositions](https://en.wikipedia.org/wiki/Composition_(combinatorics)). I'm not aware of anything out-of-the-box in the Python world. – Joseph Wood Dec 22 '22 at 02:31

4 Answers4

3

Here's a generator solution using itertools.combinations_with_replacement, don't know if it will be suitable for your needs.

def partitions(n, b):
    masks = numpy.identity(b, dtype=int)
    for c in itertools.combinations_with_replacement(masks, n): 
        yield sum(c)

output = numpy.array(list(partitions(3, 4)))
# [[3 0 0 0]
#  [2 1 0 0]
#  ...
#  [0 0 1 2]
#  [0 0 0 3]]

The complexity of this function grows exponentially, so there is a discrete boundary between what is feasible and what is not.

Note that while numpy arrays need to know their size at construction, this is easily possible since the multiset number is easily found. Below might be a better method, I have done no timings.

from math import factorial as fact
from itertools import combinations_with_replacement as cwr

nCr = lambda n, r: fact(n) / fact(n-r) / fact(r)

def partitions(n, b):
    partition_array = numpy.empty((nCr(n+b-1, b-1), b), dtype=int)
    masks = numpy.identity(b, dtype=int)
    for i, c in enumerate(cwr(masks, n)): 
        partition_array[i,:] = sum(c)
    return partition_array
Jared Goguen
  • 8,772
  • 2
  • 18
  • 36
  • Nice answer, +1. I believe the initialization size of the `partition_array` in the second snippet is wrong, should be n+b-1 choose b-1, no? See https://en.wikipedia.org/wiki/Multinomial_distribution#Properties. – Ian Hincks Jul 08 '16 at 16:14
  • @IanHincks That would make sense... I hope... updated accordingly. – Jared Goguen Jul 08 '16 at 17:59
2

For reference purposes, the following code uses Ehrlich's algorithm to iterate through all possible combinations of a multiset in C++, Javascript, and Python:

https://github.com/ekg/multichoose

This can be converted to the above format using this method. Specifically,

for s in multichoose(k, set):
    row = np.bincount(s, minlength=len(set) + 1)

This still isn't pure numpy, but can be used to fill a preallocated numpy.array pretty quickly.

Andrew Mao
  • 35,740
  • 23
  • 143
  • 224
1

here is a naive implementation with list comprehensions, not sure about performance compared to numpy

def gen(n,k):
    if(k==1):
        return [[n]]
    if(n==0):
        return [[0]*k]
    return [ g2 for x in range(n+1) for g2 in [ u+[n-x] for u in gen(x,k-1) ] ]

> gen(3,4)
[[0, 0, 0, 3],
 [0, 0, 1, 2],
 [0, 1, 0, 2],
 [1, 0, 0, 2],
 [0, 0, 2, 1],
 [0, 1, 1, 1],
 [1, 0, 1, 1],
 [0, 2, 0, 1],
 [1, 1, 0, 1],
 [2, 0, 0, 1],
 [0, 0, 3, 0],
 [0, 1, 2, 0],
 [1, 0, 2, 0],
 [0, 2, 1, 0],
 [1, 1, 1, 0],
 [2, 0, 1, 0],
 [0, 3, 0, 0],
 [1, 2, 0, 0],
 [2, 1, 0, 0],
 [3, 0, 0, 0]]
karakfa
  • 66,216
  • 7
  • 41
  • 56
  • I did the tests and if you use numba this answer is an order of magnitude faster than the other answers – Shep Bryan Mar 01 '22 at 18:46
  • 1
    if you import numba as nb then add " @nb.njit(cache=True)" to the line above "def gen..." this is incredibly fast. – Shep Bryan Mar 01 '22 at 18:51
0

Here's the solution I came up with for this.

import numpy, itertools
def multinomial_combinations(n, k, max_power=None):
    """returns a list (2d numpy array) of all length k sequences of 
    non-negative integers n1, ..., nk such that n1 + ... + nk = n."""
    bar_placements = itertools.combinations(range(1, n+k), k-1)
    tmp = [(0,) + x + (n+k,) for x in bar_placements]
    sequences =  numpy.diff(tmp) - 1
    if max_power:
        return sequences[numpy.where((sequences<=max_power).all(axis=1))][::-1]
    else:
        return sequences[::-1]

Note 1: The [::-1] at the end just reverses the order to match your example output.

Note 2: Finding these sequences is equivalent to finding all ways to arrange n stars and k-1 bars in (to fill n+k-1 spots) (see stars and bars thm 2).

Note 3: The max_power argument is to give you the option to return only sequences where all integers are below some max.

mathandy
  • 1,892
  • 25
  • 32