7

In some code I want to choose n random numbers in [0,1) which sum to 1.

I do so by choosing the numbers independently in [0,1) and normalizing them by dividing each one by the total sum:

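from random import random
numbers = [random() for _ in range(n)]  # n independent draws from [0, 1)
total = sum(numbers)  # compute the sum once, not once per element
numbers = [x / total for x in numbers]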

My "problem" is that the distribution I get is quite skewed. When I choose a million numbers, not a single one exceeds 1/2. With some effort I've calculated the pdf, and it's not nice.

Here is the weird looking pdf I get for 5 variables:

[Image: plot of the pdf for 5 variables]

Do you have an idea for a nice algorithm for choosing the numbers that results in a more uniform or simpler distribution?

Thomas Ahle
  • 5
    I'm not sure I understand: if you break the number 1 into a million random pieces, there _shouldn't_ be one that's over 0.5. If there were, the other 999,999 would have to fit in the other half. – DShook Apr 11 '11 at 14:23
  • possible duplicate of [Generate multiple random numbers to equal a value in python](http://stackoverflow.com/questions/3589214/generate-multiple-random-numbers-to-equal-a-value-in-python) – BlueRaja - Danny Pflughoeft Apr 11 '11 at 16:39
  • Take a look at [Random Vectors with Fixed Sum](http://www.mathworks.com/matlabcentral/fileexchange/9700-random-vectors-with-fixed-sum). The download link leads to a file with MATLAB code and a document explaining the algorithm. – NPE Apr 11 '11 at 14:26

3 Answers

19

You are looking to partition the distance from 0 to 1.

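Choose n - 1 numbers uniformly from 0 to 1, sort them, and take the distances between consecutive values, including the gap from 0 to the smallest and from the largest to 1. That gives n pieces.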

This will partition the space 0 to 1, which should yield the occasional large result which you aren't getting.

Even so, for large values of n, you can generally expect your max value to decrease as well, just not as quickly as with your method.
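
A minimal sketch of the partition described above, using only the standard library. The function name random_partition is hypothetical, and the sketch assumes the endpoints 0 and 1 are added as boundaries so that n - 1 cut points yield n pieces:

```python
import random


def random_partition(n):
    """Split [0, 1] into n pieces using n - 1 sorted uniform cut points."""
    cuts = sorted(random.random() for _ in range(n - 1))
    # Add the endpoints so the first and last gaps are counted too.
    bounds = [0.0] + cuts + [1.0]
    return [b - a for a, b in zip(bounds, bounds[1:])]


pieces = random_partition(5)
print(pieces)
print(sum(pieces))  # 1.0 up to floating-point rounding
```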

LanceH
  • 1
    A lovely algorithm. Do you know what distribution this may result in? – Thomas Ahle Apr 11 '11 at 20:50
  • Other than calling it a "random partition", I don't know a way to refer to it. I've always viewed it from the partitioning side of things, not from the distribution of the segment lengths. – LanceH Apr 12 '11 at 15:24
  • I've deduced the cdf `1-(1-x)^n` and pdf `n(1-x)^(n-1)`. The distribution seems to have a higher probability of small numbers than mine (it doesn't have the peak near 1/n), so it probably also has more large numbers. I haven't compared it with the Dirichlet distribution yet. – Thomas Ahle Apr 12 '11 at 21:56
  • 2
    This is the beta distribution, of which the simplest case is the probability distribution of the minimum element of n uniforms. – Jérémie May 09 '11 at 02:29
6

You might be interested in the Dirichlet distribution, which is used to generate quantities that sum to 1, for example when you're looking for probabilities. There's also a section on how to generate them using gamma distributions here.
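
A minimal sketch of the gamma-based approach, assuming a symmetric Dirichlet distribution with a single concentration parameter. The name dirichlet_sample and the default alpha=1.0 are illustrative choices, not taken from the answer. It draws n independent Gamma(alpha, 1) variates with random.gammavariate and normalizes them:

```python
import random


def dirichlet_sample(n, alpha=1.0):
    """Draw one sample from a symmetric Dirichlet(alpha, ..., alpha)."""
    # Independent Gamma(alpha, 1) draws; normalizing them gives a Dirichlet sample.
    gammas = [random.gammavariate(alpha, 1.0) for _ in range(n)]
    total = sum(gammas)
    return [g / total for g in gammas]


sample = dirichlet_sample(5)
print(sample)
print(sum(sample))  # 1.0 up to floating-point rounding
```

With alpha = 1 the sample is uniform over all non-negative vectors that sum to 1. Values of alpha below 1 give spikier results, and values above 1 pull the entries toward 1/n. For very small alpha the gamma draws can underflow, so this sketch is not robust in that regime.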

job
  • You generally need some distribution other than uniform from which to draw your numbers. As job's answer suggests, you can use the Gamma distribution with alpha < 1 to get "peakier" results. Doing so will give you a draw from the Dirichlet distribution, which is convenient since it's the conjugate prior of the multinomial you seek. – Jonathan Chang Apr 11 '11 at 15:34
  • The article has a nice "drawing" section, which I've added some code examples to. I'm not really sure if it matters what the parameters are, as long as they are equal though? – Thomas Ahle Apr 12 '11 at 21:58
0

Another way to get n random numbers which sum up to 1:

import random


def create_norm_arr(n, remaining=1.0):
    random_numbers = []
    for _ in range(n - 1):
        r = random.random()  # get a random number in [0, 1)
        r = r * remaining
        remaining -= r
        random_numbers.append(r)
    random_numbers.append(remaining)
    return random_numbers

random_numbers = create_norm_arr(5)
print(random_numbers)
print(sum(random_numbers))

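This makes large values more likely, but the entries are not identically distributed: the first is uniform on [0, 1) with mean 1/2, and each later draw is scaled by whatever remains, so later entries tend to be smaller.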
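
A rough check of how this scheme treats each position, as a sketch rather than a rigorous analysis. The helper names stick_breaking and position_means are hypothetical; stick_breaking repeats the logic of create_norm_arr so the block runs on its own:

```python
import random


def stick_breaking(n):
    """Same scheme as create_norm_arr: break off a random share of what remains."""
    remaining = 1.0
    parts = []
    for _ in range(n - 1):
        r = random.random() * remaining
        remaining -= r
        parts.append(r)
    parts.append(remaining)
    return parts


def position_means(n, trials=20000):
    """Estimate the average value at each position over many samples."""
    sums = [0.0] * n
    for _ in range(trials):
        for i, x in enumerate(stick_breaking(n)):
            sums[i] += x
    return [s / trials for s in sums]


print(position_means(4))  # roughly [0.5, 0.25, 0.125, 0.125]
```

If the order of the numbers should not matter, calling random.shuffle on the list before returning it removes the dependence on position.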

Martin Thoma