Generate random numbers summing to a predefined value

Question

So here is the deal: I want to (for example) generate 4 pseudo-random numbers, that when added together would equal 40. How could this be dome in python? I could generate a random number 1-40, then generate another number between 1 and the remainder,etc, but then the first number would have a greater chance of "grabbing" more.

score 119 · Answer 1 · edited May 22 '19 at 13:55

Here's the standard solution. It's similar to Laurence Gonsalves' answer, but has two advantages over that answer.

It's uniform: each combination of 4 positive integers adding up to 40 is equally likely to come up with this scheme.

and

it's easy to adapt to other totals (7 numbers adding up to 100, etc.)

import random

def constrained_sum_sample_pos(n, total):
    """Return a randomly chosen list of n positive integers summing to total.
    Each such list is equally likely to occur."""

    dividers = sorted(random.sample(range(1, total), n - 1))
    return [a - b for a, b in zip(dividers + [total], [0] + dividers)]

Sample outputs:

>>> constrained_sum_sample_pos(4, 40)
[4, 4, 25, 7]
>>> constrained_sum_sample_pos(4, 40)
[9, 6, 5, 20]
>>> constrained_sum_sample_pos(4, 40)
[11, 2, 15, 12]
>>> constrained_sum_sample_pos(4, 40)
[24, 8, 3, 5]

Explanation: there's a one-to-one correspondence between (1) 4-tuples (a, b, c, d) of positive integers such that a + b + c + d == 40, and (2) triples of integers (e, f, g) with 0 < e < f < g < 40, and it's easy to produce the latter using random.sample. The correspondence is given by (e, f, g) = (a, a + b, a + b + c) in one direction, and (a, b, c, d) = (e, f - e, g - f, 40 - g) in the reverse direction.

If you want nonnegative integers (i.e., allowing 0) instead of positive ones, then there's an easy transformation: if (a, b, c, d) are nonnegative integers summing to 40 then (a+1, b+1, c+1, d+1) are positive integers summing to 44, and vice versa. Using this idea, we have:

def constrained_sum_sample_nonneg(n, total):
    """Return a randomly chosen list of n nonnegative integers summing to total.
    Each such list is equally likely to occur."""

    return [x - 1 for x in constrained_sum_sample_pos(n, total + n)]

Graphical illustration of constrained_sum_sample_pos(4, 10), thanks to @FM. (Edited slightly.)

0 1 2 3 4 5 6 7 8 9 10  # The universe.
|                    |  # Place fixed dividers at 0, 10.
|   |     |       |  |  # Add 4 - 1 randomly chosen dividers in [1, 9]
  a    b      c    d    # Compute the 4 differences: 2 3 4 1

+1 This was informative -- thanks. I edited your answer, adding a graphical illustration that helped me figure out the algorithm. Normally, I would be reluctant to do that, but I thought others might find it useful. Feel free to alter or undo my edit. — FMc, Aug 28 '10 at 14:02
Captain, I am detecting large amounts of win in this sector! +1 — Jim Lewis, Aug 28 '10 at 17:28
@FM: Thanks; nice addition. I did edit it slightly to suit my 0-based view of the universe; I hope this doesn't affect the clarity. — Mark Dickinson, Aug 28 '10 at 18:37
If you need to constraint the generated integers to be higher than a certain value ``low``, this can be done by replacing ``a - b`` with ``a - b + (low-1)`` and to compensate for the ``n*(low-1)`` increase in the new sum by replacing the two instances of ``total`` with ``total - (min-1)*n``. I haven't figured out a way to add a ``high`` threshold too. — Jonas Lindeløv, Aug 15 '15 at 20:56
This answer is so good and well-written. I don't understand why this is not the accepted answer. I think many people are looking for this answer. — Shaun Han, Mar 05 '21 at 11:12
To allow zeros in the set you could use `random.choices()` instead of `sample()` — charrison, Mar 25 '23 at 21:24

Ruggero Turra · Answer 2 · 2021-11-26T21:45:14.393

21

Use multinomial distribution

from numpy.random import multinomial
multinomial(40, [1/4.] * 4)

Each variable will be distributed as a binomial distribution with mean n * p equal to 40 * 1/4 = 10 in this example.

edited Nov 26 '21 at 21:45

answered Jan 25 '16 at 17:25

Ruggero Turra

16,929
16
85
141

3

Clearly the cleanest and sturdiest solution, but probably a bit more explanation in the answer would have helped the OP understand _why_ this is the best answer – Hamman Samuel Jul 19 '19 at 16:32
This seems to produce values close to equal, rather than arbitrary ones in the desired range: `multinomial(2**16, [1/3] * 3)/2**16` -> `array([0.33073425, 0.33273315, 0.33653259])` (multiple runs give similar results). Doesn't look uniform to me – kram1032 Nov 22 '21 at 15:35
That wasn't my complaint. The sum is indeed correct. The issue is with the uniformity of the samples. They will hover closely to just evenly split intervals, rather than sometimes giving a few much larger or smaller ones. The top-voted answer does manage that. – kram1032 Nov 22 '21 at 16:06
@kram1032, they are not uniform, they are binomial, with mean `n * p` which is `1/3 * 2**16 ~ 21k` in your case. OP didn't ask for uniformity. – Ruggero Turra Nov 22 '21 at 16:10
Ah fair enough, indeed, uniformity wasn't asked for. – kram1032 Nov 22 '21 at 16:11

Laurence Gonsalves · Accepted Answer · 2010-08-28T21:21:47.217

12

b = random.randint(2, 38)
a = random.randint(1, b - 1)
c = random.randint(b + 1, 39)
return [a, b - a, c - b, 40 - c]

(I assume you wanted integers since you said "1-40", but this could be easily generalized for floats.)

Here's how it works:

cut the total range in two randomly, that's b. The odd range is because there are going to be at least 2 below the midpoint and at least 2 above. (This comes from your 1 minimum on each value).
cut each of those ranges in two randomly. Again, the bounds are to account for the 1 minimum.
return the size of each slice. They'll add up to 40.

edited Aug 28 '10 at 21:21

answered Aug 28 '10 at 02:54

Laurence Gonsalves

137,896
35
246
299

3

I think you need `a = random.randint(1, b-1)` and `c = random.randint(b+1, 39)` to make sure that you don't get zeros in the output list. Also, this has a slightly peculiar distribution: results of the form `[1, 1, x, 38-x]` are significantly more likely to occur than they would be for a uniform distribution. – Mark Dickinson Aug 28 '10 at 13:06
@Mark: I believe you are correct. I had a couple of off by one errors in there. – Laurence Gonsalves Aug 28 '10 at 22:03

score 8 · Answer 4 · answered Aug 28 '10 at 02:43

8

Generate 4 random numbers, compute their sum, divide each one by the sum and multiply by 40.

If you want Integers, then this will require a little non-randomness.

answered Aug 28 '10 at 02:43

lalli

6,083
7
42
55

This will create a non-uniform distribution https://stackoverflow.com/a/8068956/2075003 – n1000 Jul 20 '18 at 17:04

Jim Lewis · Answer 5 · 2010-08-28T03:11:26.800

3

There are only 37^4 = 1,874,161 arrangements of four integers in the range [1,37] (with repeats allowed). Enumerate them, saving and counting the permutations that add up to 40. (This will be a much smaller number, N).

Draw uniformly distributed random integers K in the interval [0, N-1] and return the K-th permutation. This can easily be seen to guarantee a uniform distribution over the space of possible outcomes, with each sequence position identically distributed. (Many of the answers I'm seeing will have the final choice biased lower than the first three!)

edited Aug 28 '10 at 03:11

answered Aug 28 '10 at 03:04

Jim Lewis

43,505
7
82
96

This answer does not generalize very well. – Richard Jan 15 '21 at 23:17

score 1 · Answer 6 · answered Mar 25 '23 at 22:17

Here's a humble improvement to @Mark Dickinson's version to allow zeros in the generated integers (so that they are non-negative, rather than positive):

import random

def constrained_sum_sample_pos(n, total):
    """Return a randomly chosen list of n non-negative integers summing to total.
    Each such list is equally likely to occur."""

    dividers = sorted(random.choices(range(0, total), k=n-1))
    return [a - b for a, b in zip(dividers + [total], [0] + dividers)]

The function random.choices() samples with replacement, as opposed to random.sample() which samples without replacement. It's new from Python 3.6.

Interesting! The OP didn't specify what distribution they wanted, but it's worth noting that in this solution the distribution can be a bit odd, especially for small `n`. For example, with `n = 3` and `total = 2`, there are six solutions: (`[0, 0, 2], [0, 2, 0], [2, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0]`). It looks as though this code would give `[0, 1, 1]` approximately half the time and `[0, 0, 2]` and `[1, 0, 1]` one quarter of the time (each). — Mark Dickinson, Mar 26 '23 at 09:05

score 0 · Answer 7 · answered Jul 29 '20 at 16:45

If you want true randomness then use:

import numpy as np
def randofsum_unbalanced(s, n):
    # Where s = sum (e.g. 40 in your case) and n is the output array length (e.g. 4 in your case)
    r = np.random.rand(n)
    a = np.array(np.round((r/np.sum(r))*s,0),dtype=int)
    while np.sum(a) > s:
        a[np.random.choice(n)] -= 1
    while np.sum(a) < s:
        a[np.random.choice(n)] += 1
    return a

If you want a greater level of uniformity then take advantage of the multinomial distribution:

def randofsum_balanced(s, n):
    return np.random.multinomial(s,np.ones(n)/n,size=1)[0]

John Mee · Answer 8 · 2020-02-28T01:09:57.560

Building on @markdickonson by providing some control over distribution between the divisors. I introduce a variance/jiggle as a percentage of the uniform distance between each.

 def constrained_sum_sample(n, total, variance=50):
    """Return a random-ish list of n positive integers summing to total.

    variance: int; percentage of the gap between the uniform spacing to vary the result.
    """
    divisor = total/n
    jiggle = divisor * variance / 100 / 2
    dividers = [int((x+1)*divisor + random.random()*jiggle) for x in range(n-1)]
    result = [a - b for a, b in zip(dividers + [total], [0] + dividers)]
    return result

Sample output:

[12, 8, 10, 10]
[10, 11, 10, 9]
[11, 9, 11, 9]
[11, 9, 12, 8]

The idea remains to divide the population equally, then randomly move them left or right within the given range. Since each value is still bound to the uniform point we don't have to worry about it drifting.

Good enough for my purposes, but not perfect. eg: the first number will always vary higher, and the last will always vary lower.

Generate random numbers summing to a predefined value

8 Answers8

Linked

Related