I've searched through a number of similar questions, but unfortunately I haven't been able to find an answer to this problem. I hope someone can point me in the right direction.

I need to come up with a PHP function that produces a random number within a set range and with a given mean. The range, in my case, will always be 1 to 100. The mean could be anything within the range.

For example...

r = f(x)

where...

r = the resulting random number

x = the mean

...running this function in a loop should produce random values whose average is very close to x. (The more times we loop, the closer we get to x.)

Running the function in a loop, assuming x = 10, should produce a curve similar to this:

     +
    + +
   +     +
  +             +     
+                               +

Where the curve starts at 1, peaks at 10, and ends at 100.

Unfortunately, I'm not well versed in statistics. Perhaps someone can help me word this problem correctly to find a solution?

John Conde
gunner1095
  • Have you tried anything? Also, please add a small example of what the expected output could look like. – Rizier123 Apr 26 '15 at 20:11
  • This is more of a statistics problem than a coding issue. Have you tried googling to see what formula you have to implement? – Rohit Gupta Apr 26 '15 at 20:17
  • @Rizier123 Ok, I added a sample curve to try to illustrate what I'm after. – gunner1095 Apr 26 '15 at 20:24
  • @Rohit Gupta, yes, I looked at a number of posts, including this one: http://stackoverflow.com/questions/3109670/generate-random-numbers-with-probabilistic-distribution But I just don't get what the answer is. – gunner1095 Apr 26 '15 at 20:25
  • Also, there are a lot of examples with standard distributions, but I couldn't find anything with non-standard distributions, like my example. – gunner1095 Apr 26 '15 at 20:29

2 Answers

Interesting question. I'll sum it up:

  1. We need a function f(x)
  2. f returns an integer
  3. If we run f a million times, the average of the returned integers is x (or at least very close to it)

I am sure there are several approaches. This one builds a table of weights that rises toward the mean and falls away after it, loosely in the spirit of the binomial distribution: http://en.wikipedia.org/wiki/Binomial_distribution

Here is the code:

function f($x){
    $min = 0;
    $max = 100;
    $curve = 1.1;
    $mean = $x;
    $precision = 5; //higher is more precise but slower

    $dist = array();

    $lastval = $precision;
    $belowsize = $mean-$min;
    $abovesize = $max-$mean;
    $belowfactor = pow(pow($curve,50),1/$belowsize);

    $left = 0;
    for($i = $min; $i< $mean; $i++){
        $dist[$i] = round($lastval*$belowfactor);
        $lastval = $lastval*$belowfactor;
        $left += $dist[$i];
    }
    $dist[$mean] = round($lastval*$belowfactor);

    $abovefactor = pow($left,1/$abovesize);
    for($i = $mean+1; $i <= $max; $i++){
        $dist[$i] = round($left-$left/$abovefactor);
        $left = $left/$abovefactor;
    }

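    // Expand the weights into a flat lookup table: each value appears
    // as many times as its weight.
    $map = array();
    foreach ($dist as $int => $quantity) {
        for ($n = 0; $n < $quantity; $n++) {
            $map[] = $int;
        }
    }

    // Pick one entry uniformly at random.
    return $map[mt_rand(0, count($map) - 1)];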
}

You can test it out like this (worked for me):

$results = array();
for($i = 0;$i<100;$i++){
    $results[] = f(20);
}
$average = array_sum($results) / count($results);
echo $average;

It gives a distribution curve that looks like this: [image: distribution curve]
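
To inspect the shape of the output and not just its mean, the results can be tallied per value. The following is a minimal sketch and not part of the original answer; `tallyResults` is a hypothetical helper name. It uses `ksort` so that each count stays attached to its value, whereas a plain `sort` would renumber the keys.

```php
<?php
// Hypothetical helper: count how often each value occurs, ordered by value.
function tallyResults(array $results) {
    $counts = array_count_values($results); // value => number of occurrences
    ksort($counts);                         // order by value, keeping the keys
    return $counts;
}
```

For example, `print_r(tallyResults($results));` with the `$results` array from the test loop above lists the values that actually occurred and how often.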

Martin Gottweis
  • Hmm... The average certainly seems correct, but I don't see any values over 40. In my scenario there should be values all the way up to 100 on occasion. Try adding this code to see the distribution... `$dist = array_count_values($results); sort($dist); var_dump($dist);` – gunner1095 Apr 26 '15 at 20:50
  • Oh, that's right. Give me a minute, I'll rewrite it a bit. – Martin Gottweis Apr 26 '15 at 21:14
  • That distribution curve is exactly what I'm after, but when I run your code I get a different result. Using the following values `$min = 1; $max = 100; $curve = 1.1; $mean = $x; $precision = 3;` I ran the function in a loop 1000 times and I got an average of 14 and values that loop to about 60 then stop. I'm going to continue to test it to make sure it's not something I'm doing wrong... – gunner1095 Apr 27 '15 at 00:44
  • I apologize, it appears I messed up the result set by sorting it incorrectly. The function seems to work! However, even when I loop it 5,000 times I don't get a single value above 83. I'm guessing this is due to the probability being very low. Can you confirm? – gunner1095 Apr 27 '15 at 01:53
  • Yep, the probability is rather low and there is not much we can do about it really. Maybe if you provided your specific case we could modify the distribution so it works better. The curve can be flatter/more linear so the probability of getting, say, a 95 won't be so tiny. – Martin Gottweis Apr 27 '15 at 11:17
  • Yes, a flatter curve could do the trick. In my specific use case I'm trying to optimize for hitting as many larger values as possible without going above the average. But, I'm not sure if that's even possible. – gunner1095 Apr 27 '15 at 14:31
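
One way to get the flatter curve discussed in the comments above is to drop the bell shape entirely. The sketch below is a different technique from the answer's weight table: a mixture of two uniform ranges, with the mixing probability chosen so that the expected value is exactly $x. The name `flatRandomWithMean` is made up for illustration, and the code assumes integer arguments with $min < $max and $min <= $x <= $max.

```php
<?php
// Sketch only: two-piece uniform mixture, not the weight-table method above.
// Assumes integers with $min < $max and $min <= $x <= $max.
function flatRandomWithMean($x, $min = 1, $max = 100) {
    // Choose the lower range [$min, $x] with probability
    // p = ($max - $x) / ($max - $min). The expected value is then
    // p * ($min + $x) / 2 + (1 - p) * ($x + $max) / 2 = $x.
    if (mt_rand(1, $max - $min) <= $max - $x) {
        return mt_rand($min, $x);
    }
    return mt_rand($x, $max);
}
```

Every value on one side of $x is equally likely, so large values stay rare when $x is small but do not become vanishingly unlikely. The trade-off is a step at $x instead of a smooth peak.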

I'm not sure I understood what you mean, but even if I didn't, this is still a pretty neat snippet:

<?php
    function array_avg($array) {  // Returns the average (mean) of the numbers in an array
        return array_sum($array)/count($array);
    }

    function randomFromMean($x, $min = 1, $max = 100, $leniency = 3) {

        /*
            $x          The mean that the returned numbers should be close to
            $min        The minimum number in the range
            $max        The maximum number in the range
            $leniency   How far the mean of the returned numbers may be from $x
        */

        $res = [mt_rand($min,$max)];
        while (true) {
            $res_avg = array_avg($res);

            // Stop once the running mean is within $leniency of $x.
            if (abs($res_avg - $x) <= $leniency) {
                return $res;
            }
            // Otherwise pull the running mean back toward $x. The plain else
            // also covers a mean equal to $min or $max.
            if ($res_avg > $x) {
                array_push($res, mt_rand($min, $x));
            } else {
                array_push($res, mt_rand($x, $max));
            }
        }
    }

    $res = randomFromMean(22);  // This function returns an array of random numbers that have a mean close to the first param.
?>

If you then var_dump($res), you get something like this:

array (size=4)
  0 => int 18
  1 => int 54
  2 => int 22
  3 => int 4

EDIT: Using a low value for $leniency (like 1 or 2) will result in huge arrays. After testing, I recommend a leniency of around 3.

Dendromaniac
  • Looks like your function returns an array. Instead, I would like it to return a single random value. Then, if I used the function in a loop, the resulting values should create the distribution described above. Hope that clarifies it a bit more. – gunner1095 Apr 27 '15 at 00:38
  • @gunner1095 I think I understand. The function I made does the loop for you, but if you want I can try to separate them. – Dendromaniac Apr 27 '15 at 00:43
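
Separating the loop from the function, as offered in the comment above, could look roughly like the sketch below. It is an assumption about what such a split might look like, not the answerer's code: `makeSteeredGenerator` is a hypothetical name, and the closure keeps a running sum so that each call returns a single value while steering the running mean toward $x. Integer arguments with $min <= $x <= $max are assumed.

```php
<?php
// Sketch only: returns a closure that yields one value per call.
// Assumes integers with $min <= $x <= $max.
function makeSteeredGenerator($x, $min = 1, $max = 100) {
    $sum = 0;   // running total of all values returned so far
    $count = 0; // number of values returned so far

    return function () use (&$sum, &$count, $x, $min, $max) {
        if ($count === 0) {
            $value = mt_rand($min, $max); // first draw: anywhere in the range
        } elseif ($sum > $count * $x) {
            $value = mt_rand($min, $x);   // running mean above $x: draw at or below $x
        } else {
            $value = mt_rand($x, $max);   // running mean at or below $x: draw at or above $x
        }
        $sum += $value;
        $count++;
        return $value;
    };
}
```

With `$next = makeSteeredGenerator(22);`, each `$next()` call returns one integer between 1 and 100. The running sum never drifts more than $max - $min away from $count * $x, so the running mean approaches $x as the number of calls grows. Because each draw depends on the earlier ones, the values are not independent, and the histogram is roughly flat on each side of $x, not the curve sketched in the question.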