9

Creating a random vector whose sum is X (e.g. X=1000) is fairly straightforward:

import random
def RunFloat():
    Scalar = 1000
    VectorSize = 30
    RandomVector = [random.random() for i in range(VectorSize)]
    RandomVectorSum = sum(RandomVector)
    RandomVector = [Scalar*i/RandomVectorSum for i in RandomVector]
    return RandomVector
RunFloat()

The code above creates a vector whose values are floats and whose sum is 1000.

I'm having difficulty writing a simple function that creates a vector whose values are integers and whose sum is X (e.g. X=1000*30):

import random
def RunInt():
    LowerBound = 600
    UpperBound = 1200
    VectorSize = 30
    RandomVector = [random.randint(LowerBound,UpperBound) for i in range(VectorSize)]
    RandomVectorSum = 1000*30
    #Sanity check that our RandomVectorSum is sensible/feasible
    if LowerBound*VectorSize <= RandomVectorSum <= UpperBound*VectorSize:
        if sum(RandomVector) == RandomVectorSum:
            return RandomVector
        else:
            return RunInt()

Does anyone have any suggestions to improve on this idea? My code might never finish or run into recursion depth problems.

Edit (July 9, 2012)

Thanks to Oliver, mgilson, and Dougal for their inputs. My solution is shown below.

  1. Oliver was very creative with the multinomial distribution idea
  2. Put simply, (1) is much more likely to output certain solutions than others. Dougal demonstrated that the multinomial solution space is neither uniform nor normal, using a simple test/counterexample based on the Law of Large Numbers. Dougal also suggested using numpy's multinomial function, which saves me a lot of trouble, pain, and headaches.
  3. To overcome (2)'s output issue, I use RandFloats() to give what appears (I haven't tested this, so it's just a superficial appearance) to be a more uniform distribution. How much of a difference does this make compared to (1)? I don't really know off-hand, but it's good enough for my use.
  4. Thanks again to mgilson for the alternative method that does not use numpy.

Here is the code that I have made for this edit:

Edit #2 (July 11, 2012)

I realized that the normal distribution was not correctly implemented; I have since modified it to the following:

import random
def RandFloats(Size):
    Scalar = 1.0
    VectorSize = Size
    RandomVector = [random.random() for i in range(VectorSize)]
    RandomVectorSum = sum(RandomVector)
    RandomVector = [Scalar*i/RandomVectorSum for i in RandomVector]
    return RandomVector

from numpy.random import multinomial
import math
def RandIntVec(ListSize, ListSumValue, Distribution='Normal'):
    """
    Inputs:
    ListSize = the size of the list to return
    ListSumValue = The sum of list values
    Distribution = can be 'uniform' for a uniform distribution, 'normal' for a normal distribution ~ N(0,1) truncated at +/- 3 sigma (default), or a list of size 'ListSize' or 'ListSize - 1' for an empirical (arbitrary) distribution giving the probabilities of each of the outcomes. These should sum to 1 (however, the last element is always assumed to account for the remaining probability, as long as sum(pvals[:-1]) <= 1).
    Output:
    A list of random integers of length 'ListSize' whose sum is 'ListSumValue'.
    """
    if type(Distribution) == list:
        DistributionSize = len(Distribution)
        if ListSize == DistributionSize or (ListSize-1) == DistributionSize:
            Values = multinomial(ListSumValue,Distribution,size=1)
            OutputValue = Values[0]
        else:
            raise ValueError ('Distribution list must have length ListSize or ListSize-1')
    elif Distribution.lower() == 'uniform': #I do not recommend this!!!! I see that it is not as random (at least on my computer) as I had hoped
        UniformDistro = [1.0/ListSize for i in range(ListSize)]
        Values = multinomial(ListSumValue,UniformDistro,size=1)
        OutputValue = Values[0]
    elif Distribution.lower() == 'normal':
        """
        Normal Distribution Construction....It's very flexible and hideous
        Assume a +-3 sigma range.  Warning, this may or may not be a suitable range for your implementation!
        If one wishes to explore a different range, then changes the LowSigma and HighSigma values
        """
        LowSigma    = -3#-3 sigma
        HighSigma   = 3#+3 sigma
        StepSize    = 1/(float(ListSize) - 1)
        ZValues     = [(LowSigma * (1-i*StepSize) +(i*StepSize)*HighSigma) for i in range(int(ListSize))]
        #Construction parameters for N(Mean,Variance) - Default is N(0,1)
        Mean        = 0
        Var         = 1
        #NormalDistro= [self.NormalDistributionFunction(Mean, Var, x) for x in ZValues]
        NormalDistro= list()
        for i in range(len(ZValues)):
            if i==0:
                ERFCVAL = 0.5 * math.erfc(-ZValues[i]/math.sqrt(2))
                NormalDistro.append(ERFCVAL)
            elif i ==  len(ZValues) - 1:
                ERFCVAL = NormalDistro[0]
                NormalDistro.append(ERFCVAL)
            else:
                ERFCVAL1 = 0.5 * math.erfc(-ZValues[i]/math.sqrt(2))
                ERFCVAL2 = 0.5 * math.erfc(-ZValues[i-1]/math.sqrt(2))
                ERFCVAL = ERFCVAL1 - ERFCVAL2
                NormalDistro.append(ERFCVAL)  
            #print "Normal Distribution sum = %f"%sum(NormalDistro)
            Values = multinomial(ListSumValue,NormalDistro,size=1)
            OutputValue = Values[0]
        else:
            raise ValueError ('Cannot create desired vector')
        return OutputValue
    else:
        raise ValueError ('Cannot create desired vector')
    return OutputValue
#Some Examples        
ListSize = 4
ListSumValue = 12
for i in range(100):
    print RandIntVec(ListSize, ListSumValue,Distribution=RandFloats(ListSize))

The code above can be found on GitHub. It is part of a class I built for school. user1149913 also posted a nice explanation of the problem.

tmthydvnprt
torrho
  • You may have a problem if you want to sum many float numbers and get an exact value... – JBernardo Jul 08 '12 at 03:54
  • possible duplicate of [Generate multiple random numbers to equal a value in python](http://stackoverflow.com/questions/3589214/generate-multiple-random-numbers-to-equal-a-value-in-python) – finnw Jul 08 '12 at 09:40
  • After running into a related problem, this would argue that this question is related to asking "how to randomly select a partition of n whose size is X". [related wiki](http://en.wikipedia.org/wiki/Partition_(number_theory)) – torrho Mar 02 '14 at 20:50

7 Answers

4

I would suggest not doing this recursively:

When you sample recursively, the value from the first index has a much greater possible range, whereas values in subsequent indices will be constrained by the first value. This will yield something resembling an exponential distribution.

Instead, what I'd recommend is sampling from the multinomial distribution. This will treat each index equally, constrain the sum, force all values to be integers, and sample uniformly from all possible configurations that follow these rules (note: configurations that can happen multiple ways will be weighted by the number of ways that they can occur).

To map your question onto the multinomial notation: the total sum is n (an integer), and each of the k values (one for each index, also an integer) must be between 0 and n. Then follow the recipe here.

(Or use numpy.random.multinomial as @Dougal helpfully suggested).
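
For concreteness, here is a minimal sketch of that idea with numpy (an illustrative example, not code from the question or answer, using the question's 30-element, 30000-sum setup):

import numpy as np

VectorSize = 30
TargetSum = 1000 * 30
# drop TargetSum "balls" into VectorSize equally likely "urns";
# the per-urn counts are non-negative integers that always sum to TargetSum
vec = np.random.multinomial(TargetSum, [1.0 / VectorSize] * VectorSize)
print(vec.sum())  # 30000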

Stef
user
  • If numpy is available, [`numpy.random.multinomial`](http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.multinomial.html) also implements this. – Danica Jul 08 '12 at 04:03
  • @user1245262 Dirichlet has values in `[0,1]^n` that sum to 1. OP wants integers that sum to X. Multinominal has the right support, though who knows if it's the distribution OP wants. – Danica Jul 08 '12 at 04:55
  • @Dougal - But doesn't multinomial only insist that the sum of the number of instances of each outcome be N, rather than the sum of the outcomes be N. Dirichilet requires that the sum of the N outcomes is 1, but it shouldn't be too hard to generalize that, if only by multiplying all the outcomes by a constant factor and then adjusting for rounding... unless I'm missing something here... – user1245262 Jul 08 '12 at 05:01
  • @user1245262 Oh, good point. This isn't multinomial after all. Dirichlet would work by multiplying, but rounding would actually probably be hard unless `N` is very large... – Danica Jul 08 '12 at 05:06
  • @user1245262 Dougal is correct. Multinomial is the generalization of the binomial; you start with N balls and partition them among K urns. The only difference between a Dirichlet and a multinomial is that the support of the Dirichlet (both in N and the number of balls in each bin) are the reals, whereas the multinomial is the natural numbers. Since the OP requests integers, I would vote for multinomial over Dirichlet. But regardless, these (*i.e.* multinomial and Dirichlet) are the correct family of solution. – user Jul 08 '12 at 05:06
  • You would not multiply anything you are simply partitioning balls into urns. Thus, you would simply sample from the multinomial distribution as specified by Wikipedia or using another algorithm (such as the one defined in `numpy.random.multinomial`). – user Jul 08 '12 at 05:08
  • @Oliver The problem is that multinomial treats `1 2 3` as adding up to 3, not 6. – Danica Jul 08 '12 at 05:09
  • @Dougal. Incorrect. Using the Wikipedia notation, you have k=3, n=6. x1 = 1, x2 = 2, x3 = 3. – user Jul 08 '12 at 05:10
  • @user1245262 "But doesn't multinomial only insist that the sum of the number of instances of each outcome be N, rather than the sum of the outcomes be N." I see what you mean, but you're incorrectly conflating N (the sum of trials) and K (the number of bins). – user Jul 08 '12 at 05:13
  • @Oliver Well, I guess I'll just have to review. If I wanted the distribution of outcomes for N discrete independent indentically distributed random events, I would use a multinomial distribution. If I want to add the condition that the sum of those N events is a known number, I would use some variation of a Dirichilet distribution. As I'm reading this question, the known vector size corresponds to the N events, but there is a desire that their sum be set to a particular number. – user1245262 Jul 08 '12 at 05:17
  • @user1245262 Both distributions have the property that you're discussing. The only difference between a multinomial and a Dirichlet is the support. The Dirichlet allows fractional numbers of balls in a bin (as well as fractional sums). Look at the probability density function; they are identical, except Dirichlet uses Γ(x+1) while multinomial uses x!. For integer values, these are equal; the only difference is that Γ is defined for reals. – user Jul 08 '12 at 05:21
  • @Oliver Argh, I was misunderstanding how you were defining the multinomial. You're right, of course. When we want a list of sum `N` we're just distributing balls of value 1 into each of the list element bins. – Danica Jul 08 '12 at 05:25
  • @Oliver But in the multinom., the support vars represent the # of times each outcome occurs. In the Dirich., they represent the value of each outcome. For the distribution of outcomes of rolling 3 6-sided dice, I'd use a multinom. distrib. For the distribution of outcomes for rolling 3 6-sided dice with a total of 12, I'd use a variant of the Dirich. distrib. The support of an N-dim. Dirich. distrib. is a N-1 dim. simplex. This will enforce the limitation on the sum of the N events. A multinom. distrib only requires that there are exactly N events. There are no restrictions on their sum. – user1245262 Jul 08 '12 at 05:45
  • @user1245262 Re-read the last paragraph of the answer; the variables aren't what you'd typically think of as a multinomial. Here the bins are each element of the vector, and the values are the number of times each element was incremented. The total of the vector sums to `N` because there are `N` events. – Danica Jul 08 '12 at 05:47
  • In terms of implementation, note that Wikipedia's method above and I think also numpy's method (though I'm having trouble finding the source) take time linear in `K`, the length of the vector, while it's actually possible to generate these in time *independent* of `K` using the [alias method](http://en.wikipedia.org/wiki/Alias_method) ([orig paper](http://web.eecs.utk.edu/~vose/Publications/random.pdf)), after linear setup time. If you're making a lot of large vectors, this is worth it. Here's a [nice, though lengthy, intro to the alias method](http://www.keithschwarz.com/darts-dice-coins/). – Danica Jul 08 '12 at 05:54
  • Wow that was a great and simple way of doing it! This is great...I'll edit my post a solution very shortly... – torrho Jul 08 '12 at 05:57
  • @user1245262 "For the distribution of outcomes for rolling 3 6-sided dice with a total of 12, I'd use a variant of the Dirich. distrib". I know what you mean, but consider it as a likelihood rather than a probability (*i.e.* swap the parameters and random variables). Since 1) counts are integers and 2) the total (12) is an integer, then **the Dirichlet is equivalent to the multinomial**. The probability density functions are simply equivalent in that case. **If you don't want to consider real-valued counts, the Dirichlet is not appropriate.** – user Jul 08 '12 at 06:00
  • @Dougal - It looks as though he has 30 dice (vector elements). Each die can take integral values from 600 to 1200. He wants the distribution of rolls (vectors) that give a total (vector sum) of 30,000. Am I misinterpreting something here? I'll check this thread again in the morning, I'm starting to fade here.... – user1245262 Jul 08 '12 at 06:02
  • I don't think this is the appropriate venue to discuss further, especially since it is marked as accepted. – user Jul 08 '12 at 06:05
  • Note that this approach doesn't actually "sample uniformly from all possible configurations that follow these rules" as claimed. For a length-3 vector with sum 10, `3 3 4` is 4200 times more likely than `0 0 10`. – Danica Jul 08 '12 at 06:32
  • @Dougal I edited to clarify; it depends what you count as equivalent. Some *combinations* will be weighted more highly, but all *permuatations* (many of which may lead to the same *combination*) will be given equal weight. I believe this is what the OP is requesting. – user Jul 08 '12 at 06:37
  • @Oliver Of course, but the permutations are irrelevant to the OP's original problem and are an artifact of this formulation. – Danica Jul 08 '12 at 06:38
  • @user1245262 I only used an upper and lower bound value just for flexibility. – torrho Jul 08 '12 at 07:33
  • @Dougal Thank you for the Numpy idea. That saves me a few headaches. – torrho Jul 08 '12 at 07:33
3

I just ran both @Oliver's multinomial approach and @mgilson's code a million times each, for a length-3 vector summing to 10, and looked at the number of times each possible outcome came up. Both are extremely nonuniform:

(I'm about to show the indexing approach.)

Does this matter? Depends on whether you want "an arbitrary vector with this property that's usually different each time" vs each valid vector being equally likely.

In the multinomial approach, of course 3 3 4 is going to be much more likely than 0 0 10 (4200 times more likely, as it turns out). mgilson's biases are less obvious to me, but 0 0 10 and its permutations were the least likely by far (only ~750 times each out of a million); the most common were 1 4 5 and its permutations; not sure why, but they were certainly the most common, followed by 1 3 6. It'll typically start with a sum that's too high in this configuration (expectation 15), though I'm not sure why the reduction works out that way....

One way to get a uniform output over the possible vectors would be a rejection scheme. To get a vector of length K with sum N, you'd:

  1. Sample a vector of length K with integer elements uniformly and independently between 0 and N.
  2. Repeat until the sum of the vector is N.

Obviously this is going to be extremely slow for non-tiny K and N.
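
Still, a minimal sketch of that rejection scheme (illustrative only, and fine for tiny cases) would be:

import random

def rejection_int_vec(total, length):
    # keep drawing independent uniform vectors until one happens to hit the target sum
    while True:
        vec = [random.randint(0, total) for _ in range(length)]
        if sum(vec) == total:
            return vec

print(rejection_int_vec(10, 3))  # every valid vector is equally likely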

Another approach would be to assign a numbering to all the possible vectors; there are (N + K - 1) choose (K - 1) such vectors, so just choose a random integer in that range to decide which one you want. One reasonable way to number them is lexicographic ordering: (0, 0, 10), (0, 1, 9), (0, 2, 8), (0, 3, 7), ....

Note that the last (Kth) element of the vector is uniquely determined by the sum of the first K-1.

I'm sure there's a nice way to immediately jump to whatever index in this list, but I can't think of it right now. Enumerating the possible outcomes and walking over them will work, but will probably be slower than necessary. Here's some code for that (though we actually use reverse lexicographic ordering here).

from itertools import islice, combinations_with_replacement
from functools import reduce
from math import factorial
from operator import mul
import random

def _enum_cands(total, length):
    # get all possible ways of choosing `total` of our indices (with repetition)
    # for example, with total=10 and length=3 the first one is 0000000000,
    # meaning we picked index 0 ten times, for [10, 0, 0]
    for t in combinations_with_replacement(range(length), total):
        cand = [0] * length
        for i in t:
            cand[i] += 1
        yield tuple(cand)

def int_vec_with_sum(total, length):
    num_outcomes = reduce(mul, range(total + 1, total + length)) // factorial(length - 1)
    # that's integer division, even though SO thinks it's a comment :)
    idx = random.choice(range(num_outcomes))
    return next(islice(_enum_cands(total, length), idx, None))

As shown in the histogram above, this is actually uniform over possible outcomes. It's also easily adaptable to upper/lower bounds on any individual element; just add the condition to _enum_cands.
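
For instance, a hypothetical bounded variant might look like the following (the names are illustrative, and materializing the candidate list is only practical for small sums and lengths, since the closed-form count above no longer applies):

import random
from itertools import combinations_with_replacement

def _enum_cands_bounded(total, length, minval, maxval):
    for t in combinations_with_replacement(range(length), total):
        cand = [0] * length
        for i in t:
            cand[i] += 1
        # the added condition: keep only candidates within the per-element bounds
        if all(minval <= c <= maxval for c in cand):
            yield tuple(cand)

def bounded_int_vec_with_sum(total, length, minval, maxval):
    cands = list(_enum_cands_bounded(total, length, minval, maxval))
    return random.choice(cands)

print(bounded_int_vec_with_sum(10, 3, 1, 6))  # e.g. (2, 4, 4)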

This is slower than either of the other answers: for sum 10 length 3, I get

  • 14.7 us using np.random.multinomial,
  • 33.9 us using mgilson's,
  • 88.1 us with this approach

I'd expect that the difference would get worse as the number of possible outcomes increases.

If someone comes up with a nifty formula for indexing into these vectors somehow, it'd be much better....

Danica
  • You make a great point about this distribution of solutions in the multinomial implementation. My first hunch about all of this is that the multinomial idea is the one I desire. your comment about "In the multinomial approach, of course 3 3 4 is going to be much more likely than 0 0 10" is exactly what I had in mind. You applied Law of Large Numbers and the multinomial method has a frequency plot that I like. Thanks for your input! I appreciate that you point this out – torrho Jul 08 '12 at 07:52
  • I suppose one underlying problem of this multinomial distribution solutions is a way to 'correct' the likehood of outputs for a sum of X and a vector length Y such that: the likelihood of an output is the same to any other alternative. I'll leave that one for another day. :( – torrho Jul 08 '12 at 08:11
2

The most efficient way to sample uniformly from the set of partitions of N elements into K bins is to use a dynamic programming algorithm, which is O(KN). There are a multichoose number of possibilities (http://mathworld.wolfram.com/Multichoose.html), so enumerating every one will be very slow. Rejection sampling and other Monte Carlo methods will also likely be very slow.

Other methods people propose, like sampling from a multinomial, do not draw samples from a uniform distribution.

Let T(n,k) be the number of partitions of n elements into k bins; then we can compute the recurrence

T(n, 1) = 1  for all n >= 0
T(n, k) = sum_{m <= n} T(n - m, k - 1)

To sample K elements that sum to N, sample from K multinomial distributions going "backward" in the recurrence. Edit: the T's in the multinomials below should be normalized to sum to one before drawing each sample.

n1 = multinomial([T(N,K-1),T(N-1,K-1),...,T(0,K-1)])
n2 = multinomial([T(N-n1,K-1),T(N-n1-1,K-1),...,T(0,K-1)])
...
nK = multinomial([T(N-sum([n1,...,n{k-1}]),1),T(N-sum([n1,...,n{k-1}])-1,1),...,T(0,1)])

Note: I am allowing 0's to be sampled.

This procedure is similar to sampling a set of hidden state from a segmental semi-markov model (http://www.gatsby.ucl.ac.uk/%7Echuwei/paper/icml103.pdf).
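
A small Python 3 sketch of that procedure (an illustrative implementation of the recurrence above, not the original code; random.choices handles the normalization mentioned in the edit):

import random
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n, k):
    # number of ways to write n as an ordered sum of k non-negative integers
    if k == 1:
        return 1
    return sum(T(n - m, k - 1) for m in range(n + 1))

def sample_vec(N, K):
    vec, remaining = [], N
    for parts_left in range(K, 1, -1):
        # P(this part = m) is proportional to T(remaining - m, parts_left - 1)
        weights = [T(remaining - m, parts_left - 1) for m in range(remaining + 1)]
        m = random.choices(range(remaining + 1), weights=weights)[0]
        vec.append(m)
        remaining -= m
    vec.append(remaining)  # the last part is forced
    return vec

print(sample_vec(10, 3))  # uniform over length-3 vectors of non-negative integers summing to 10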

user1149913
1

Here's a pretty straightforward implementation.

import random
import math

def randvec(vecsum, N, maxval, minval):
    if N*minval > vecsum or N*maxval < vecsum:
        raise ValueError ('Cannot create desired vector')

    indices = list(range(N))
    vec = [random.randint(minval,maxval) for i in indices]
    diff = sum(vec) - vecsum # we were off by this amount.

    #Iterate through, incrementing/decrementing a random index 
    #by 1 for each value we were off.
    while diff != 0:  
        addthis = 1 if diff > 0 else -1 # +/- 1 depending on if we were above or below target.
        diff -= addthis

        ### IMPLEMENTATION 1 ###
        idx = random.choice(indices) # Pick a random index to modify, check if it's OK to modify
        while not (minval <= (vec[idx] - addthis) <= maxval):  #operator chaining.  If you don't know it, look it up.  It's pretty cool.
            idx = random.choice(indices) #Not OK to modify.  Pick another.

        vec[idx] -= addthis #Update that index.

        ### IMPLEMENTATION 2 ###
        # random.shuffle(indices)
        # for idx in indices:
        #    if minval <= (vec[idx] - addthis) <= maxval:
        #        vec[idx]-=addthis
        #        break
        #
        # in situations where (based on choices of N, minval, maxval and vecsum)
        # many of the values in vec MUST BE minval or maxval, Implementation 2
        # may be superior.

    return vec

a = randvec(1000,20,100,1)
print sum(a)
mgilson
  • Well it gives a nonuniform distribution, which might be an issue. It depends on what the OP wants. – Antimony Jul 08 '12 at 03:59
  • One problem with this is that the last element could exceed UpperBound. – DSM Jul 08 '12 at 03:59
  • @Antimony +1 I agree. The first value will have a much higher chance of being large compared to subsequent values (see my answer below). – user Jul 08 '12 at 04:01
  • @DSM -- You're right. I didn't read the code above very carefully. I've edited with a new solution. – mgilson Jul 08 '12 at 04:52
  • @Oliver -- updated. I think/hope this is a nicer solution than posted previously. – mgilson Jul 08 '12 at 04:55
  • @mgilson I tried to read your code and found it difficult, so I added some spaces and made a few things more pythonic. Now I can actually understand what it's doing. :) – Danica Jul 08 '12 at 05:18
  • @Dougal -- I thought a function like `random.choice` must exist, but I didn't know the name. `sample` was the best I could come up with. Adding space is always a good idea, so I heartily agree. The operator chaining is sometimes a little difficult for me to understand (Of course, I spend a lot of time coding in Fortran, so that may be why). Would you have any objection to at least adding parenthesis around the `vec[idx] - addthis` in the `while` condition? – mgilson Jul 08 '12 at 05:25
  • @mgilson Parens around `vec[idx] - addthis` would definitely be reasonable - and yeah, Python's operator chaining is a little weird if you're used to thinking in low-level languages. I'd drop the parens in your implementation 2 `if` statement, though. – Danica Jul 08 '12 at 05:28
  • @Dougal -- and I agree that `math.copysign` is a bit difficult to handle, but I expect it to be slightly more efficient than the ternary operator. `sign` is a very common function in other languages. Note that in python 2.x, the same could be achieved by `cmp(diff,0)`, but (alas) in python 3.x, `cmp` is gone. – mgilson Jul 08 '12 at 05:28
  • `%timeit int(math.copysign(1, np.random.normal()))` gives `678 ns per loop`; `%timeit 1 if np.random.normal() > 0 else -1` gives `288 ns per loop`. So actually the ternary is both faster and more readable. :) (Dropping the `int` call drops it to 471 ns, still slower.) In 2.7, `cmp(np.random.normal(), 0)` is actually slower than `copysign`, though only slightly. – Danica Jul 08 '12 at 05:30
  • @Dougal -- Parenthesis dropped. Thanks for the timing. I suppose python can optimize out the 2 (expensive) function calls with the ternary -- I wonder if the problem would be better if copysign returned an int already. *sigh* languages never behave the way you expect them to. Thanks for the input. It's a better answer with your edits -- Although, (thankfully), the algorithm didn't have to change. – mgilson Jul 08 '12 at 05:33
0

This version will give a uniform distribution:

from random import randint

def RunInt(VectorSize, Sum):
   x = [randint(0, Sum) for _ in range(1, VectorSize)]
   x.extend([0, Sum])
   x.sort()
   return [x[i+1] - x[i] for i in range(VectorSize)]
finnw
  • This returns some invalid results, e.g. `(10, 0, 1)` when called with 3, 10. The distribution also isn't actually uniform: http://i.imgur.com/tPTxC.png. All but one of the outcomes on the less-common peak are valid outcomes. – Danica Jul 08 '12 at 19:28
0

Just to give you another approach: implement a partition_function(X), then randomly choose a number between 0 and the length of partition_function(1000), and there you have it. Now you just need to find an efficient way to calculate the partition function. These links might help:

http://code.activestate.com/recipes/218332-generator-for-integer-partitions/

http://oeis.org/A000041

EDIT: Here is some simple code:

import itertools
import random
all_partitions = {0:set([(0,)]),1:set([(1,)])}

def partition_merge(a,b):
    # combine every partition in a with every partition in b into a single sorted tuple
    c = set()
    for t in itertools.product(a,b):
        c.add(tuple(sorted(list(t[0]+t[1]))))
    return c

def my_partition(n):
    # memoized: build the partitions of n from the partitions of the smaller pieces i and n-i
    if all_partitions.has_key(n):
        return all_partitions[n]
    a = set([(n,)])
    for i in xrange(1,n/2+1):
        a = partition_merge(my_partition(i),my_partition(n-i)).union(a)
    all_partitions[n] = a
    return a

if __name__ == '__main__':
    n = 30
    # if you have a few years to wait uncomment the next line
    # n = 1000
    a = my_partition(n)
    i = random.randint(0,len(a)-1)
    print(list(a)[i])
zenpoy
0

What about:

import numpy as np
def RunInt(VectorSize, Sum):
    a = np.array([np.random.rand(VectorSize)])
    b = np.floor(a/np.sum(a)*Sum)   # scale the random floats to the target sum and round down
    for i in range(int(Sum-np.sum(b))):   # hand the rounding shortfall back out, one unit at a time
        b[0][np.random.randint(len(b[0]))] += 1
    return b[0]
ChaimBlau
  • This problem is equivalent to randomly sampling a [partition](http://en.wikipedia.org/wiki/Partition_(number_theory)) whose size is VectorSize. – torrho Jun 03 '15 at 14:16