0

Given n trials, with a probability p of winning each trial, what is the probability of winning r or more trials?

My thinking goes as follows: Each combination of wins and losses has probability p^w * (p-1)^(n-w) where w is number of wins.

Each number of wins can occur in nCr combinations, ex. winning 2 out 3 times means you might lose the first, the second or the third times, e.g. three combinations.

So the probability of winning 2 out of 3 times is 3C2 * p^2 * (1-p)^1. The probability of winning 2 or more times is just the sum of this calculation for 2 and 3 wins.

I have the following code:

import math

def nCr(n,r):
    f = math.factorial
    return f(n) / f(r) / f(n-r)

def prob_at_least(n, r, p):
    probs = [nCr(n,k)*pow(p,k)*pow(1.0-p,n-k) for k in range(r, n+1)]
    return sum(probs)

This code works, but is there a built-in function, or a shorter way to achieve the same?

Alec
  • 8,529
  • 8
  • 37
  • 63
BSkagen
  • 3
  • 1

2 Answers2

2

From the scipy.stats module, you could use binom.

>>> import scipy.stats as scs
>>> print(scs.binom.pmf(2, 5, .5))
0.3125

Edit: to get r or more:

>>> trials = 0
>>> n = 5
>>> r = 2
>>> p = .5
>>> for i in range(r):
    trials += scs.binom.pmf(i, n, p)
 r_or_more = 1 - trials

Edit: the solution given by ljeabmreosn gives the cumulative distribution function which doesn't require my loop.

Chris Charley
  • 6,403
  • 2
  • 24
  • 26
1

There are much faster ways to implement combinations:

import operator as op
from functools import reduce

def nCr(n, r):
    r = min(r, n-r)
    numerator = reduce(op.mul, range(n, n-r, -1), 1)
    denominator = reduce(op.mul, range(1, r+1), 1)
    return numerator / denominator

But if you're doing this a lot, you might want to consider a package like scipy, which has special.comb for efficient combination calculation

Severin Pappadeux
  • 18,636
  • 3
  • 38
  • 64
Alec
  • 8,529
  • 8
  • 37
  • 63
  • Thanks a lot! Do you know if a faster or more standard method exists for my prob_at_least function too? – BSkagen May 25 '19 at 17:40
  • I don't think there's a more efficient formula, but doing it on a numpy array will certainly be faster than a python list. – Alec May 25 '19 at 17:45
  • If my answer helped, would you mind clicking the little check beneath the score, which will mark it "accepted"? – Alec May 25 '19 at 17:48
  • @AlecAlameddine How come your rep is 1? – Severin Pappadeux May 29 '19 at 19:37
  • And this answer is a bit wrong - in Python3 `/` is float division, so value returned from `nCr()` would be float. You should use `//` divison here, and in case of Python2 `import division from __future__` – Severin Pappadeux May 30 '19 at 14:36