I want to create a random np.array whose values are drawn only from [0.05, 0.1, 0.15, ..., 0.9, 0.95, 1] and whose sum equals 1.

I know how to create a random array, for example, of 5 elements, so that the sum is equal to 1:

array = np.random.random(5)
array /= np.sum(array)

But how can I make sure that the values in this array come only from [0.05, 0.1, 0.15, ..., 0.9, 0.95, 1]?

Update: This method works, but maybe there are more Pythonic ways. It generates 100 arrays:

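import numpy as np

allowed = np.arange(1, 21) / 20  # 0.05, 0.1, ..., 0.95, 1 (0 is excluded)
sets = []
while len(sets) < 100:
    array = np.random.choice(allowed, 5)
    # Compare with a tolerance; an exact == 1 test can miss valid float sums
    if np.isclose(np.sum(array), 1):
        sets.append(array)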
JSer1

1 Answer

Your problem is equivalent to the following problem:

  1. Choose 5 random integers in [1, 20] with a sum of 20, where the integers appear in random order.
  2. Divide the chosen integers by 20.

The Python code below shows how this can be implemented. It has the following advantages:

  • It does not use rejection sampling.
  • It chooses uniformly at random from among all combinations that meet the requirements.

It's based on an algorithm by John McClane, which he posted as an answer to another question. I describe the algorithm in another answer.

import random # Or secrets

def _getSolTable(n, mn, mx, sum):
        t = [[0 for i in range(sum + 1)] for j in range(n + 1)]
        t[0][0] = 1
        for i in range(1, n + 1):
            for j in range(0, sum + 1):
                jm = max(j - (mx - mn), 0)
                v = 0
                for k in range(jm, j + 1):
                    v += t[i - 1][k]
                t[i][j] = v
        return t

def intsInRangeWithSum(numSamples, numPerSample, mn, mx, sum):
        """ Generates one or more combinations of 'numPerSample'
        numbers each, where each combination's numbers sum to
        'sum' and are listed in any order, and each number is in
        the interval '[mn, mx]'. The combinations are chosen
        uniformly at random. 'mn', 'mx', and 'sum' may not be
        negative. Returns an empty list if 'numSamples' is zero.
        The algorithm is thanks to a _Stack Overflow_ answer
        (`questions/61393463`) by John McClane.
        Raises an error if there is no solution for the given
        parameters.  """
        adjsum = sum - numPerSample * mn
        # Min, max, sum negative
        if mn < 0 or mx < 0 or sum < 0:
            raise ValueError
        # No solution
        if numPerSample * mx < sum:
            raise ValueError
        if numPerSample * mn > sum:
            raise ValueError
        if numSamples == 0:
            return []
        # One solution
        if numPerSample * mx == sum:
            return [[mx for i in range(numPerSample)] for i in range(numSamples)]
        if numPerSample * mn == sum:
            return [[mn for i in range(numPerSample)] for i in range(numSamples)]
        samples = [None for i in range(numSamples)]
        table = _getSolTable(numPerSample, mn, mx, adjsum)
        for sample in range(numSamples):
            s = adjsum
            ret = [0 for i in range(numPerSample)]
            for ib in range(numPerSample):
                i = numPerSample - 1 - ib
                # Or secrets.randbelow(table[i + 1][s])
                v = random.randint(0, table[i + 1][s] - 1)
                r = mn
                v -= table[i][s]
                while v >= 0:
                    s -= 1
                    r += 1
                    v -= table[i][s]
                ret[i] = r
            samples[sample] = ret
        return samples

Example:

weights=intsInRangeWithSum(
   # One sample
   1,
   # Count of numbers per sample
   5,
   # Range of the random numbers
   1, 20,
   # Sum of the numbers
   20)
# Divide by 20 to get weights that sum to 1
weights=[x/20.0 for x in weights[0]]
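
For this particular case there is also a shorter route, sketched here as an alternative and not as the algorithm above. With 5 positive integers summing to 20, no part can exceed 16, so the upper bound of 20 never applies, and a "stars and bars" draw is enough: pick 4 distinct cut points among the 19 gaps between 20 units and take the lengths of the resulting pieces. The function name random_weights and its parameters are made up for illustration, and the sketch assumes NumPy 1.17 or later for default_rng. It does not generalize to cases where the upper bound actually restricts the values; those need the table-based approach.

```python
import numpy as np

def random_weights(num_parts=5, total=20, rng=None):
    """Return num_parts positive multiples of 1/total that sum to 1."""
    rng = np.random.default_rng() if rng is None else rng
    # Choose num_parts - 1 distinct cut points from 1 .. total - 1.
    # Every set of cut points is equally likely.
    cuts = np.sort(rng.choice(np.arange(1, total), size=num_parts - 1,
                              replace=False))
    # Distances between consecutive cuts (with 0 and total as end points)
    # are positive integers that sum to total.
    parts = np.diff(np.concatenate(([0], cuts, [total])))
    return parts / total

weights = random_weights()
print(weights)
```

Because each set of cut points corresponds to exactly one ordered list of parts, every valid array is equally likely, which matches the uniformity property claimed for the longer code.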

In any case, your code below:

array = np.random.random(5)
array /= np.sum(array)

...does not produce a uniformly random combination of numbers that sum to 1. Use the following in its place, noting that uniform arrivals are exponentially spaced, not uniformly spaced. (To be clear, this does not solve the problem in your question; it is only an observation. That is setting aside the fact that the numpy.random.* functions have been legacy functions since NumPy 1.17, in part because they use global state.)

array = np.random.exponential(size=5)  # 5 draws; a bare 5 would set the scale
array /= np.sum(array)
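
Since the legacy functions were mentioned, here is a minimal sketch of the same observation using the Generator interface introduced in NumPy 1.17. The variable names are only illustrative. It also shows a flat Dirichlet draw, which has the same distribution as the normalized exponentials.

```python
import numpy as np

# A Generator carries its own state; nothing global is touched.
rng = np.random.default_rng()

# Normalizing independent exponential draws gives a point that is
# uniformly distributed over all nonnegative 5-vectors with sum 1.
array = rng.exponential(size=5)
array /= array.sum()

# Same distribution in a single call: Dirichlet with all parameters 1.
array2 = rng.dirichlet(np.ones(5))

print(array, array2)
```

As with the snippet above, this produces continuous values, so it does not restrict the result to multiples of 0.05.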
Peter O.