-1

Is there any well-known algorithm which can generate an array of weighted random numbers having the same constant sum?

This problem can be stated in another way: Say, I have a total value of 20. Which should be distributed into 3 parts each of them has a weight of 2,4,3 respectively. So, I need 3 random numbers which will have a sum of 20 but the distribution will follow the weight.

I have tried:

Range=20
W=[2,4,3]
Prob=[i/float(sum(W)) for i in W]
Weighted_array=np.random.multinomial(Range, Prob)

Is there any better option?

Joshua
  • 40,822
  • 8
  • 72
  • 132
  • I have tried with random.multinomial() in python. – Imam Al Razi Jun 17 '18 at 20:50
  • Please give more detail. What kind of numbers do you mean--only positive integers, or are float numbers allowed? What do you mean by "the distribution will follow the weight"? What kind of probability distribution do you want for the results--some kind of uniform distribution over all possibilities, you don't care, or something else? – Rory Daulton Jun 17 '18 at 20:59
  • Sorry for unclarity. I need to generate integers. For that example the weighted_array can be: [4,10,6],[3,12,5] etc. "the distribution will follow the weight" means the resultant array elements will be weighted according to the W array elements. Hope that makes clear. – Imam Al Razi Jun 17 '18 at 21:03
  • But you have 20 potential numbers sampled and only three weights. Suppose I use your code as a black box, sample 1mln samples and just put them in histogram from 1 (or zero? a bit unclear) to 20. How those three weight will manifest themselves? – Severin Pappadeux Jun 18 '18 at 00:00

1 Answers1

0

Symbolic:

This is a linear Diophantine equation of n=3 variables. To solve it analytically in python you can use sympy, as in this answer:

from random import randint
from sympy.solvers.diophantine import diop_linear
from sympy.abc import x, y, z

# Our original equation is 2 * x + 4 * y + 3 * z = 20
# Re-arrange to 2 * x + 4 * y + 3 * z - 20 = 0, and input the left side to sympy
general_solution = diop_linear(2 * x + 4 * y + 3 * z - 20)

def get_random_valid_triple():
    t0,t1 = general_solution[2].free_symbols
    # You can pick whatever bounds you want here
    a = randint(-100, 100)
    b = randint(-100, 100)
    solution = [s.subs({t0: a, t1:b}) for s in general_solution]
    print(solution)
# Get a random solution
get_random_valid_triple()

Brute Force:

Alternately, at least for small n and tight bounds on each variable, you can just precompute all possible solutions and use random.choice to choose one each time. For example, we restrict all of the variables to be positive, then they're forced to be in [0, 20 / coefficient], and there are only 14 solutions. We can generate these in python as follows:

import random
import itertools
n = 20
coefficients = [2, 4, 3]
valid_solutions = []
ranges = [range(0, n // k + 1) for k in coefficients]
for value in itertools.product(*ranges):
    if sum(value[j] * i for j, i in enumerate(coefficients)) == n:
        valid_solutions.append(value)

print("All solutions:")
print("\n".join(str(i) for i in valid_solutions))
print("Random solution:")
print(random.choice(valid_solutions))

This yields:

All solutions:
(0, 2, 4)
(0, 5, 0)
(1, 0, 6)
(1, 3, 2)
(2, 1, 4)
(2, 4, 0)
(3, 2, 2)
(4, 0, 4)
(4, 3, 0)
(5, 1, 2)
(6, 2, 0)
(7, 0, 2)
(8, 1, 0)
(10, 0, 0)
Random solution:
(10, 0, 0)
Ollin Boer Bohan
  • 2,296
  • 1
  • 8
  • 12
  • Thanks a lot for your answer. But the problem is I need a general solution for n variables. Because my array W can have 100 or more elements. So, I think diophantine will not be the efficient one. – Imam Al Razi Jun 17 '18 at 21:48