<?php

$gender = array(
  'Male'=>30, 
  'Female'=>50,
  'U' =>20);

$total = array_sum(array_values($gender)); 

$current = 0;
$rand = rand(1,$total);

foreach ($gender as $key=>$value)
{
    $current += $value;
    if ($current > $rand)
    {
        echo $key;
        break;
    }
}

?>

I am trying to generate a random value based on weighted percentages. In this example, Male has a 30% chance, Female 50%, and U 20%. I suspected the logic in the code was wrong, so I ran the script 100 times expecting roughly 30 Males, but that wasn't the case. Is there a smarter way to do this?

3 Answers


The logic is basically right, but you should use >= as the comparison operator. To see why this is right, suppose you just have two choices with equal probability:

$gender = array('Male' => 1, 'Female' => 1);

$rand will be either 1 or 2. When $rand is 1 you would expect to select Male. Your code tests 1 > 1, which fails; the correct test is 1 >= 1, which succeeds.
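
One way to check this without relying on random sampling is to feed every possible value of $rand through both comparisons and count the picks. The sketch below assumes the weights from the question; pickWithOperator is a hypothetical helper written for this illustration, not part of the original code.

```php
<?php
// Hypothetical helper: runs the question's loop for a fixed $rand,
// using either the strict (>) or the inclusive (>=) comparison.
function pickWithOperator(array $weights, int $rand, bool $inclusive)
{
    $current = 0;
    foreach ($weights as $key => $value) {
        $current += $value;
        $hit = $inclusive ? ($current >= $rand) : ($current > $rand);
        if ($hit) {
            return $key;
        }
    }
    return null; // nothing matched (only possible with the strict test)
}

$gender = array('Male' => 30, 'Female' => 50, 'U' => 20);
$total = array_sum($gender);

$strict = array('Male' => 0, 'Female' => 0, 'U' => 0, 'none' => 0);
$inclusive = $strict;

// Try every value rand(1, $total) can return, exactly once each.
for ($rand = 1; $rand <= $total; $rand++) {
    $strict[pickWithOperator($gender, $rand, false) ?? 'none']++;
    $inclusive[pickWithOperator($gender, $rand, true) ?? 'none']++;
}

print_r($strict);    // Male 29, Female 50, U 20, none 1
print_r($inclusive); // Male 30, Female 50, U 20, none 0
```

With the strict comparison, Male loses one slot and $rand = 100 selects nothing at all, so the script occasionally prints no result.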

Also, you should probably do more than 100 tests to verify a random algorithm. 1,000 would probably produce more representative results.
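
To put rough numbers on this: if the trials are independent, the count for an option with probability p over n trials follows a binomial distribution, whose standard deviation is sqrt(n * p * (1 - p)). The sketch below applies that formula to the Male weight from the question; expectedSpread is a hypothetical helper name.

```php
<?php
// Hypothetical helper: standard deviation of a binomial count,
// i.e. how far the observed count typically strays from n * p.
function expectedSpread(int $trials, float $p): float
{
    return sqrt($trials * $p * (1 - $p));
}

foreach (array(100, 1000, 100000) as $n) {
    $sd = expectedSpread($n, 0.30);
    printf(
        "%6d trials: expect %d Males, give or take about %.1f (%.2f%% of trials)\n",
        $n,
        (int) round($n * 0.30),
        $sd,
        100 * $sd / $n
    );
}
```

For 100 trials the spread is about 4.6, so a count several Males away from 30 is unremarkable even with correct code. The relative spread shrinks as the number of trials grows.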

Barmar

You're on the right lines. There's a nice algorithm for doing this detailed in another StackOverflow answer here.

Your implementation could look like this:

function getWeightedRandom(array $options) {

    // calculate the total of all weights
    $combined = array_sum($options); 

    // generate a random number, where 0 <= $random < $combined
    $random = rand(0, $combined - 1);

    // keep subtracting weights until we drop below an option's weight
    foreach($options as $name => $weight) {
        if($random < $weight) {
            return $name;
        }
        $random -= $weight;
    }
}

// the weights to use for our trials (do not have to add up to 100)
$gender = array(
    'Male' => 30, 
    'Female' => 50,
    'U' => 20);

// used for keeping track of how many of each result
$results = array(
    'Male' => 0, 
    'Female' => 0,
    'U' => 0);

// run a large number of trials to properly test our accuracy
for($i = 0; $i < 100000; $i++) {
    $result = getWeightedRandom($gender);
    $results[$result]++;
}

print_r($results);

Output:

Array
(
    [Male] => 30013
    [Female] => 49805
    [U] => 20182
)

Looks pretty good to me!

George Brighton

Try this:

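/**
 * Pick a key at random, weighted by its rate.
 * @param array $rates Map of key => weight
 * @param int   $pow   Number of decimal digits to honour in the weights
 * @return mixed The chosen key, or false if nothing matched
 */
function randombyrates($rates, $pow = 0) {
    $much = pow(10, $pow);
    // scale the total so that fractional rates become whole numbers
    $total = (int) round(array_sum($rates) * $much);
    $rand = mt_rand(1, $total);
    $base = 0;
    foreach ($rates as $k => $v) {
        // this key owns the range [$min, $max] on the scaled number line
        $min = $base * $much + 1;
        $max = ($base + $v) * $much;
        if ($min <= $rand && $rand <= $max) {
            return $k;
        }
        $base += $v;
    }
    return false;
}
$gender = array(
   'Male'=>30, 
   'Female'=>50,
   'U' =>20);
echo randombyrates($gender);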

Good luck!
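
The $pow argument is easiest to understand with fractional weights. The sketch below is a guess at the intent rather than the answer's exact code: it multiplies each weight by 10^$digits so that mt_rand(), which only returns integers, can still honour decimals. scaleWeights and pickScaled are hypothetical names, and the 'rare'/'common' weights are made up for the example.

```php
<?php
// Hypothetical helper: turn fractional weights into integers by
// multiplying by 10^$digits (e.g. 0.5 with 1 digit becomes 5).
function scaleWeights(array $rates, int $digits): array
{
    $factor = pow(10, $digits);
    $scaled = array();
    foreach ($rates as $key => $weight) {
        $scaled[$key] = (int) round($weight * $factor);
    }
    return $scaled;
}

// Hypothetical helper: map a number in 1..total onto a key by
// walking the cumulative upper bound of each key's range.
function pickScaled(array $scaled, int $rand)
{
    $upper = 0;
    foreach ($scaled as $key => $weight) {
        $upper += $weight;
        if ($rand <= $upper) {
            return $key;
        }
    }
    return false;
}

$rates = array('rare' => 0.5, 'common' => 99.5); // illustrative weights
$scaled = scaleWeights($rates, 1);               // rare => 5, common => 995
$rand = mt_rand(1, array_sum($scaled));          // 1..1000
echo pickScaled($scaled, $rand), "\n";
```

Without the scaling step, a weight of 0.5 could not be given its own whole-number range, so one decimal digit needs a factor of 10, two digits a factor of 100, and so on.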