This problem and the different solutions proposed intrigued me. I did a little test of the three basic algorithms suggested and what average values they would yield for the numbers generated.
choose_one_and_divide_rest
means: [ 0.49999212 0.24982403 0.25018384]
standard deviations: [ 0.28849948 0.22032758 0.22049302]
time needed to fill array of size 1000000 was 26.874945879 seconds
choose_two_points_and_use_intervals
means: [ 0.33301421 0.33392816 0.33305763]
standard deviations: [ 0.23565652 0.23579615 0.23554689]
time needed to fill array of size 1000000 was 28.8600130081 seconds
choose_three_and_normalize
means: [ 0.33334531 0.33336692 0.33328777]
standard deviations: [ 0.17964206 0.17974085 0.17968462]
time needed to fill array of size 1000000 was 27.4301018715 seconds
The time measurements are to be taken with a grain of salt as they might be more influenced by the Python memory management than by the algorithm itself. I'm too lazy to do it properly with timeit
. I did this on 1GHz Atom so that explains why it took so long.
Anyway, choose_one_and_divide_rest is the algorithm suggested by Andrie and the poster of the question him/herself (AND): you choose one value a in [0,1], then one in [a,1] and then you look what you have left. It adds up to one but that's about it, the first division is twice as large as the other two. One might have guessed as much ...
choose_two_points_and_use_intervals is the accepted answer by ddzialak (DDZ). It takes two points in the interval [0,1] and uses the size of the three sub-intervals created by these points as the three numbers. Works like a charm and the means are all 1/3.
choose_three_and_normalize is the solution by Anders Gustafsson and Josh O'Brien (JOB). It just generates three numbers in [0,1] and normalizes them back to a sum of 1. Works just as well and surprisingly a little bit faster in my Python implementation. The variance is a bit lower than for the second solution.
There you have it. No idea to what beta distribution these solutions correspond or which set of parameters in the corresponding paper I referred to in a comment but maybe someone else can figure that out.