4

I would like to find distinct random numbers within a range that sums up to given number.

Note: I found similar questions in stackoverflow, however they do not address exactly this problem (ie they do not consider a negative lowerLimit for the range).

If I wanted that the sum of my random number was equal to 1 I just generate the required random numbers, compute the sum and divided each of them by the sum; however here I need something a bit different; I will need my random numbers to add up to something different than 1 and still my random numbers must be within a given range.

Example: I need 30 distinct random numbers (non integers) between -50 and 50 where the sum of the 30 generated numbers must be equal to 300; I wrote the code below, however it will not work when n is much larger than the range (upperLimit - lowerLimit), the function could return numbers outside the range [lowerLimit - upperLimit]. Any help to improve the current solution?

static void Main(string[] args)
{
    var listWeights = GetRandomNumbersWithConstraints(30, 50, -50, 300);
}

private static List<double> GetRandomNumbersWithConstraints(int n, int upperLimit, int lowerLimit, int sum)
{
    if (upperLimit <= lowerLimit || n < 1)
        throw new ArgumentOutOfRangeException();

    Random rand = new Random(Guid.NewGuid().GetHashCode());
    List<double> weight = new List<double>();

    for (int k = 0; k < n; k++)
    {
        //multiply by rand.NextDouble() to avoid duplicates
        double temp = (double)rand.Next(lowerLimit, upperLimit) * rand.NextDouble();

        if (weight.Contains(temp))
            k--;
        else
            weight.Add(temp);
    }

    //divide each element by the sum
    weight = weight.ConvertAll<double>(x => x / weight.Sum());  //here the sum of my weight will be 1 

    return weight.ConvertAll<double>(x => x * sum);
}

EDIT - to clarify

Running the current code will generate the following 30 numbers that add up to 300. However those numbers are not within -50 and 50

-4.425315699
67.70219958
82.08592061
46.54014109
71.20352208
-9.554070146
37.65032717
-75.77280868
24.68786878
30.89874589
142.0796933
-1.964407284
9.831226893
-15.21652248
6.479463312
49.61283063
118.1853036
-28.35462683
49.82661159
-65.82706541
-29.6865969
-54.5134262
-56.04708803
-84.63783048
-3.18402453
-13.97935982
-44.54265204
112.774348
-2.911427266
-58.94098071
Uwe Keim
  • 39,551
  • 56
  • 175
  • 291
ES2018
  • 438
  • 3
  • 15
  • Does your code work? – mjwills Jul 13 '18 at 12:46
  • Can you clarify. I don't see how you can have 30 random numbers adding to a specific number. You can have 29 random numbers and one carefully selected number adding to a specific number, or an infinite random series will at some point converge on any number. – Adam G Jul 13 '18 at 12:54
  • I agree with @AdamG. With random numbers, particularly negatives, you are not guaranteed to exactly sum up to 300. Do you have other requirements such as (1) find largest sum less than 300 or (2) smallest sum less than 300? – Rick Davin Jul 13 '18 at 13:16
  • 1
    Since you are using random floating point values, the goal is exactly hitting 300 is far more unlikely than if you were using random integers. – Rick Davin Jul 13 '18 at 13:21
  • If you scale the values to get at a sum of 300, you loose the -50 .. 50 constraint on the values themselves. And with negative numbers in the mix, the unscaled sum may be 0 ... – Hans Kesting Jul 13 '18 at 13:24
  • Why `sum` is int while numbers are `double` ? – Severin Pappadeux Jul 13 '18 at 13:46
  • @Rick Davin, thanks for your comment. Initially I started with random integers between -50 and 50 using the function rand.Next(lowerLimit, upperLimit). However, this solution is not good because there is no way I can retrieve distinct random numbers if the the total number that I need (the n in the function) is larger than the range (upperLimit - lowerLimit). – ES2018 Jul 13 '18 at 13:46
  • @Severin Pappadeux thanks for your suggestion, sum must be a double – ES2018 Jul 13 '18 at 13:55
  • I am interested in this, and shall post a solution shortly. You won't like it, but it'll do what you ask :) – Davesoft Jul 13 '18 at 14:12

3 Answers3

2

Ok, here how it could be done

We will use Dirichlet Distribution, which is distribution for random numbers xi in the range [0...1] such that

Sumi xi = 1

So, after linear rescaling condition for sum would be satisfied automatically. Dirichlet distribution is parametrized by αi, but we assume all RN to be from the same marginal distribution, so there is only one parameter α for each and every index.

For reasonable large value of α, mean value of sampled random numbers would be =1/n, and variance ~1/(n * α), so larger α lead to random value more close to the mean.

Ok, now back to rescaling,

vi = A + B*xi

And we have to get A and B. As @HansKesting rightfully noted, with only two free parameters we could satisfy only two constraints, but you have three. So we would strictly satisfy low bound constraint, sum value constraint, but occasionally violate upper bound constraint. In such case we just throw whole sample away and do another one.

Again, we have a knob to turn, α getting larger means we are close to mean values and less likely to hit upper bound. With α = 1 I'm rarely getting any good sample, but with α = 10 I'm getting close to 40% of good samples. With α = 16 I'm getting close to 80% of good samples.

Dirichlet sampling is done via Gamma distribution, using code from MathDotNet.

Code, tested with .NET Core 2.1

using System;

using MathNet.Numerics.Distributions;
using MathNet.Numerics.Random;

class Program
{
    static void SampleDirichlet(double alpha, double[] rn)
    {
        if (rn == null)
            throw new ArgumentException("SampleDirichlet:: Results placeholder is null");

        if (alpha <= 0.0)
            throw new ArgumentException($"SampleDirichlet:: alpha {alpha} is non-positive");

        int n = rn.Length;
        if (n == 0)
            throw new ArgumentException("SampleDirichlet:: Results placeholder is of zero size");

        var gamma = new Gamma(alpha, 1.0);

        double sum = 0.0;
        for(int k = 0; k != n; ++k) {
            double v = gamma.Sample();
            sum  += v;
            rn[k] = v;
        }

        if (sum <= 0.0)
            throw new ApplicationException($"SampleDirichlet:: sum {sum} is non-positive");

        // normalize
        sum = 1.0 / sum;
        for(int k = 0; k != n; ++k) {
            rn[k] *= sum;
        }
    }

    static bool SampleBoundedDirichlet(double alpha, double sum, double lo, double hi, double[] rn)
    {
        if (rn == null)
            throw new ArgumentException("SampleDirichlet:: Results placeholder is null");

        if (alpha <= 0.0)
            throw new ArgumentException($"SampleDirichlet:: alpha {alpha} is non-positive");

        if (lo >= hi)
            throw new ArgumentException($"SampleDirichlet:: low {lo} is larger than high {hi}");

        int n = rn.Length;
        if (n == 0)
            throw new ArgumentException("SampleDirichlet:: Results placeholder is of zero size");

        double mean = sum / (double)n;
        if (mean < lo || mean > hi)
            throw new ArgumentException($"SampleDirichlet:: mean value {mean} is not within [{lo}...{hi}] range");

        SampleDirichlet(alpha, rn);

        bool rc = true;
        for(int k = 0; k != n; ++k) {
            double v = lo + (mean - lo)*(double)n * rn[k];
            if (v > hi)
                rc = false;
            rn[k] = v;
        }
        return rc;
    }

    static void Main(string[] args)
    {
        double[] rn = new double [30];

        double lo = -50.0;
        double hi =  50.0;

        double alpha = 10.0;

        double sum = 300.0;

        for(int k = 0; k != 1_000; ++k) {
            var q = SampleBoundedDirichlet(alpha, sum, lo, hi, rn);
            Console.WriteLine($"Rng(BD), v = {q}");
            double s = 0.0;
            foreach(var r in rn) {
                Console.WriteLine($"Rng(BD),     r = {r}");
                s += r;
            }
            Console.WriteLine($"Rng(BD),    summa = {s}");
        }
    }
}

UPDATE

Usually, when people ask such question, there is an implicit assumption/requirement - all random numbers shall be distribution in the same way. It means that if I draw marginal probability density function (PDF) for item indexed 0 from the sampled array, I shall get the same distribution as I draw marginal probability density function for the last item in the array. People usually sample random arrays to pass it down to other routines to do some interesting stuff. If marginal PDF for item 0 is different from marginal PDF for last indexed item, then just reverting array will produce wildly different result with the code which uses such random values.

Here I plotted distributions of random numbers for item 0 and last item (#29) for original conditions([-50...50] sum=300), using my sampling routine. Look similar, isn't it?

enter image description here

Ok, here is a picture from your sampling routine, same original conditions([-50...50] sum=300), same number of samples

enter image description here

UPDATE II

User supposed to check return value of the sampling routine and accept and use sampled array if (and only if) return value is true. This is acceptance/rejection method. As an illustration, below is code used to histogram samples:

        int[] hh = new int[100]; // histogram allocated

        var s = 1.0; // step size
        int k = 0;   // good samples counter
        for( ;; ) {
            var q = SampleBoundedDirichlet(alpha, sum, lo, hi, rn);
            if (q) // good sample, accept it
            {
                var v = rn[0]; // any index, 0 or 29 or ....
                var i = (int)((v - lo) / s);
                i = System.Math.Max(i, 0);
                i = System.Math.Min(i, hh.Length-1);
                hh[i] += 1;

                ++k;
                if (k == 100000) // required number of good samples reached
                    break;
            }
        }
        for(k = 0; k != hh.Length; ++k)
        {
            var x = lo + (double)k * s + 0.5*s;
            var v = hh[k];
            Console.WriteLine($"{x}     {v}");
        }
Severin Pappadeux
  • 18,636
  • 3
  • 38
  • 64
0

Here you go. It'll probably run for centuries before actually returning the list, but it'll comply :)

    public List<double> TheThing(int qty, double lowest, double highest, double sumto)
    {
        if (highest * qty < sumto)
        {
            throw new Exception("Impossibru!");
            // heresy
            highest = sumto / 1 + (qty * 2);
            lowest = -highest;
        }
        double rangesize = (highest - lowest);
        Random r = new Random();
        List<double> ret = new List<double>();

        while (ret.Sum() != sumto)
        {
            if (ret.Count > 0)
                ret.RemoveAt(0);
            while (ret.Count < qty)
                ret.Add((r.NextDouble() * rangesize) + lowest);
        }

        return ret;
    }
Davesoft
  • 724
  • 1
  • 4
  • 10
0

I come up with this solution which is fast. I am sure it couldbe improved, but for the moment it does the job.

n = the number of random numbers that I will need to find

Constraints

  • the n random numbers must add up to finalSum the n random numbers

  • the n random numbers must be within lowerLimit and upperLimit

The idea is to remove from the initial list (that sums up to finalSum) of random numbers the numbers outside the range [lowerLimit, upperLimit].

Then count the number left of the list (called nValid) and their sum (called sumOfValid). Now, iteratively search for (n-nValid) random numbers within the range [lowerLimit, upperLimit] whose sum is (finalSum-sumOfValid)

I tested it with several combinations for the inputs variables (including negative sum) and the results looks good.

static void Main(string[] args)
{
    int n = 100;
    int max = 5000;
    int min = -500000;
    double finalSum = -1000;

    for (int i = 0; i < 5000; i++)
    {
        var listWeights = GetRandomNumbersWithConstraints(n, max, min, finalSum);

        Console.WriteLine("=============");
        Console.WriteLine("sum   = " + listWeights.Sum());
        Console.WriteLine("max   = " + listWeights.Max());
        Console.WriteLine("min   = " + listWeights.Min());
        Console.WriteLine("count = " + listWeights.Count());
    }
}

private static List<double> GetRandomNumbersWithConstraints(int n, int upperLimit, int lowerLimit, double finalSum, int precision = 6)
{
    if (upperLimit <= lowerLimit || n < 1) //todo improve here
        throw new ArgumentOutOfRangeException();

    Random rand = new Random(Guid.NewGuid().GetHashCode());

    List<double> randomNumbers = new List<double>();

    int adj = (int)Math.Pow(10, precision);

    bool flag = true;
    List<double> weights = new List<double>();
    while (flag)
    {
        foreach (var d in randomNumbers.Where(x => x <= upperLimit && x >= lowerLimit).ToList())
        {
            if (!weights.Contains(d))  //only distinct
                weights.Add(d);
        }

        if (weights.Count() == n && weights.Max() <= upperLimit && weights.Min() >= lowerLimit && Math.Round(weights.Sum(), precision) == finalSum)
            return weights;

        /* worst case - if the largest sum of the missing elements (ie we still need to find 3 elements, 
         * then the largest sum is 3*upperlimit) is smaller than (finalSum - sumOfValid)
         */
        if (((n - weights.Count()) * upperLimit < (finalSum - weights.Sum())) ||
            ((n - weights.Count()) * lowerLimit > (finalSum - weights.Sum())))
        {
            weights = weights.Where(x => x != weights.Max()).ToList();
            weights = weights.Where(x => x != weights.Min()).ToList();
        }

        int nValid = weights.Count();
        double sumOfValid = weights.Sum();

        int numberToSearch = n - nValid;
        double sum = finalSum - sumOfValid;

        double j = finalSum - weights.Sum();
        if (numberToSearch == 1 && (j <= upperLimit || j >= lowerLimit))
        {
            weights.Add(finalSum - weights.Sum());
        }
        else
        {
            randomNumbers.Clear();
            int min = lowerLimit;
            int max = upperLimit;
            for (int k = 0; k < numberToSearch; k++)
            {
                randomNumbers.Add((double)rand.Next(min * adj, max * adj) / adj);
            }

            if (sum != 0 && randomNumbers.Sum() != 0)
                randomNumbers = randomNumbers.ConvertAll<double>(x => x * sum / randomNumbers.Sum());
        }
    }

    return randomNumbers;
}
ES2018
  • 438
  • 3
  • 15
  • It doesn't look right, but if it is suits you, so be it. Please, take a look at update of my answer and see for yourself – Severin Pappadeux Jul 16 '18 at 05:19
  • @SeverinPappadeux Probably my solution is not the best - I am sure there is a room of improvements. Before posting my solution I also tried your solution, but it does not always work. To test is added the following code if(rn.Max() > hi || rn.Min() < lo) Console.WriteLine("Something wrong here after "); Just after var q = SampleBoundedDirichlet(alpha, sum, lo, hi, rn); and sometimes rn.Max() is greater than 50 – ES2018 Jul 16 '18 at 09:27
  • `but it does not always work` sure it does. `and sometimes rn.Max() is greater than 50` but that's how it supposed to work - I always return sampled values but with boolean flag, this is how I count good vs bad sample. User code supposed to check `q` and reject bad sample, so it would be `if (q) { doSomething(rn); }`. I wrote that with this alpha I got 40% of good samples, and with that alpha I got 80%. Let me update my answer once again to show how to use code. – Severin Pappadeux Jul 16 '18 at 13:24