2

I got a set of different events, all with their own odds to happen. Is there a way to calculate the odds for them combined, so that I get a the odds for 0, 1 2 and so on to happen.

The math is easy, but the number of calculations grow quick, so I was hoping there was a function that could help me.

Example with three factors :

Event | Yes | No
A     | 3%  | 97%
B     | 4%  | 96%
C     | 5%  | 95%

0 Happening : $A[no] * $ B[no] * $c[no]

1 happening : $A[yes] * $ B[no] * $c[no] + $A[no] * $ B[yes] * $c[no] + $A[no] * $ B[no] * $c[yes]

2 happening = $A[yes] * $ B[yes] * $c[no] + $A[yes] * $ B[no] * $c[yes] + $A[no] * $ B[yes] * $c[yes]

3 happening : $A[yes] * $ B[yes] * $c[yes]

This is easy to write in php, my problem is how this scale up if i add more events. Just adding one more doubles the number of caclulations, and soon the code would be really long.

So is there an easier way to do this? I'll be grateful for any tips or ideas.

Alok Patel
  • 7,842
  • 5
  • 31
  • 47
FingeNB
  • 223
  • 2
  • 10

2 Answers2

1

This is quite slow implementation.

Let's consider a case with 5 events.

Odds of 0 events happening is:

$no[0]  * $no[1]  * $no[2]  * $no[3]  * $no[4]

Odds of 1 event happening is:

$no[0]  * $no[1]  * $no[2]  * $no[3]  * $yes[4] +
$no[0]  * $no[1]  * $no[2]  * $yes[3] * $no[4]  +
$no[0]  * $no[1]  * $yes[2] * $no[3]  * $no[4]  +
...

where you go through all multiplications where there is exactly 1 'yes' choice.

Odds of 2 events happening is:

$no[0]  * $no[1]  * $no[2]  * $yes[3] * $yes[4] +
$no[0]  * $no[1]  * $yes[2] * $no[3]  * $yes[4] +
$no[0]  * $no[1]  * $yes[2] * $yes[3] * $no[4]  +
...

where you go through all multiplications where there is exactly 2 'yes' choices.

This can be generalized: To calculate odds of N events happening you go through all multiplications where there is exactly N 'yes' choices.

Now when you need to calculate all odds from 0 to 5 events happening, you need to go through all possible combinations of yes/no choices and add each multiplication to $odds[$yesCount].

$no[0]  * $no[1]  * $no[2]  * $no[3]  * $no[4]   ; added to $odds[0]
$no[0]  * $no[1]  * $no[2]  * $no[3]  * $yes[4]  ; added to $odds[1]
$no[0]  * $no[1]  * $no[2]  * $yes[3] * $no[4]   ; added to $odds[1]
$no[0]  * $no[1]  * $no[2]  * $yes[3] * $yes[4]  ; added to $odds[2]
$no[0]  * $no[1]  * $yes[2] * $no[3]  * $no[4]   ; added to $odds[1]
...
$yes[0] * $yes[1] * $yes[2] * $yes[3] * $yes[4]  ; added to $odds[5]

There are total of 2**5 = 32 different multiplications here, or generally 2**$eventCount.

It is easy to go through all these cases if we assign a number to each case from 0 to 2**$eventCount-1, and then use bits of this number to select whether 'yes' or 'no' choice of each event is included in multiplication, and finally add each multiplication result to $odds[$yesCount]:

// number of events
$eventCount = 5;

// odds of each event happening
$yes = [ 0.10, 0.50, 0.32, 0.66, 0.99 ];

// odds of each event not happening
$no = [];
for ($eventNumber = 0; $eventNumber < $eventCount; $eventNumber++) {
  $no[$eventNumber] = 1 - $yes[$eventNumber];
}

// initialize combined $odds to zero
$odds = [];
for ($n = 0; $n <= $eventCount; $n++) {
  $odds[$n] = 0;
}

// calculate combined odds
for ($case = 0; $case < 2 ** $eventCount; $case++) {
  $multiplication = 1;
  $yesCount = 0;
  for ($eventNumber = 0; $eventNumber < $eventCount; $eventNumber++) {
    if ($case & (1 << $eventNumber)) {
      $yesCount++;
      $multiplication *= $yes[$eventNumber];
    } else {
      $multiplication *= $no[$eventNumber];
    }
  }
  $odds[$yesCount] += $multiplication;
}

// show combined odds
for ($n = 0; $n <= $eventCount; $n++) {
  echo "Odds of " . $n . " events happening is " . $odds[$n] . "<br>\n";
}
Markus Laire
  • 2,837
  • 2
  • 17
  • 23
  • Thanks a lot! This works perfect, but any way to make it quicker would be great. The number of calculations are really high. At 25 items its over 33.000.000, and it grows quick. – FingeNB Aug 01 '16 at 21:58
  • @FingeNB Thinking this a bit more, I'm not sure whether this can be improved much for general case where each event has separate odds. Basic algorithm seems to require 2**N operations. (btw, I also tried this code in C and got about 6 more events compared to PHP, when both used same amount of time.) – Markus Laire Aug 02 '16 at 07:00
  • @MarksusLaire Again, thanks for the help! I was hoping to calculate up to 42 events, so its a crazy high number! The only input improvement I can think of is that some of the events got the same odds. Lik 5 with 3% 10 with 5% and some have the odds alone. Not sure how to implement it though. Seems like I need a supercomputer of some sort to calculate it. – FingeNB Aug 02 '16 at 08:10
  • @FingeNB When all odds are same, calculating this is really fast, see e.g. http://www.vassarstats.net/textbook/ch5apx.html I think it should be possible to use that formula to speed up calculation also when only some odds are same. – Markus Laire Aug 02 '16 at 09:14
0

Let's see what we actually need here. Let's assume, we have an array of chances:

$chances = [0.8, 0.3, 0.15];

What we need is to get those results with the following calculations:

// 0.119
$chance0 = (1-$array[0]) * (1-$array[1]) * (1-$array[2]);
// 0.548
$chance1 = $array[0] * (1-$array[1]) * (1-$array[2]) + (1-$array[0]) * $array[1] * (1-$array[2]) + (1-$array[0]) * (1-$array[1]) * $array[2];
// 0.297
$chance2 = $array[0] * $array[1] * (1-$array[2]) + $array[0] * (1-$array[1]) * $array[2] + (1-$array[0]) * $array[1] * $array[2];
// 0.036
$chance3 = $array[0] * $array[1] * $array[2];

What we actually need, is based on an integer that represents a number of successful chances, get every non-repeating combination from a pool of numbers, and make the rest "inversive", i.e. make them (1 - chance)

To achieve that, I used this answer and modified the code a bit. From the mentioned answer, we get non-repeating combinations of chances (meaning, that in case we get 2 successful chances, 0.8 and 0.15 are same as 0.15 and 0.8). Next, we use an array_diff to see what is left and assume those are failing chances. But since those are failing chances, we need to "reverse" them, calling array_map with the respective callback. The last thing is to calculate array_product to get the chance of the event happening.

And finally we just array_sum all the chances.

class Combinations implements Iterator
{
    protected $c = null;
    protected $s = null;
    protected $n = 0;
    protected $k = 0;
    protected $pos = 0;

    function __construct($s, $k) {
        if(is_array($s)) {
            $this->s = array_values($s);
            $this->n = count($this->s);
        } else {
            $this->s = (string) $s;
            $this->n = strlen($this->s);
        }
        $this->k = $k;
        $this->rewind();
    }
    function key() {
        return $this->pos;
    }
    function current() {
        $r = array();
        for($i = 0; $i < $this->k; $i++)
            $r[] = $this->s[$this->c[$i]];
        return is_array($this->s) ? $r : implode('', $r);
    }
    function next() {
        if($this->_next())
            $this->pos++;
        else
            $this->pos = -1;
    }
    function rewind() {
        $this->c = range(0, $this->k);
        $this->pos = 0;
    }
    function valid() {
        return $this->pos >= 0;
    }

    protected function _next() {
        $i = $this->k - 1;
        while ($i >= 0 && $this->c[$i] == $this->n - $this->k + $i)
            $i--;
        if($i < 0)
            return false;
        $this->c[$i]++;
        while($i++ < $this->k - 1)
            $this->c[$i] = $this->c[$i - 1] + 1;
        return true;
    }
}

function getTotalChances($calculateFrom, $successCount)
{
    $totalChance = [];

    foreach(new Combinations($calculateFrom, $successCount) as $success)
    {
        $failure = array_map(function($element) {
            return (1-$element);
        }, array_diff($calculateFrom, $success));

        $chance = array_product(array_merge($success, $failure));

        $totalChance[] = $chance;
    }

    return(array_sum($totalChance));
}

$calculateFrom = [0.8, 0.3, 0.15];

var_dump(getTotalChances($calculateFrom, 2));

Now if you run the following, you will get what you need:

var_dump(getTotalChances($calculateFrom, 0)); // 0.119
var_dump(getTotalChances($calculateFrom, 1)); // 0.548
var_dump(getTotalChances($calculateFrom, 2)); // 0.297
var_dump(getTotalChances($calculateFrom, 3)); // 0.036
Community
  • 1
  • 1
Kevin Kopf
  • 13,327
  • 14
  • 49
  • 66
  • Thanks! I tried this code asweel, it actually seems to run a bit slower, and on 25 items I got this error: Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 100663304 bytes) in C:\Program Files (x86)\EasyPHP-Devserver-16.1\eds-www\test.php on line 72 – FingeNB Aug 02 '16 at 08:49
  • @FingeNB guess it depends on your system. I just tried a set of 30 items and it run in 92 seconds. – Kevin Kopf Aug 02 '16 at 08:59