How to do a number of random trials efficiently?

Question

Let say an event have the probability P to succeed. (0 < P < 1 ) and I have to make N tests to see if this happens and I want the total number of successes:

I could go

int countSuccesses = 0;
while(N-- > 0)
{
   if(Random.NextDouble()<P) countSuccesses++; // NextDouble is from 0.0 to 1.0
}

But is there not a more efficient way to do this? I want to have a single formula so I just can use ONE draw ~~random number~~ to determine the total number of successes. (EDIT The idea of using only one draw was to get below O(n))

I want to be able call a method

GetSuccesses( n, P)

and it to be O(1)

UPDATE
I will try to go with the MathNet.Numerics.Distributions.Binomial.Sample(P, n) even if it might be using more then only one random number I'll guess it will be faster than O(n) even if its not O(1). I'll benchmark that. Big thanks to David and Rici.

UPDATE
The binomial sample above was O(n) so it did not help me. But thanks to a comment done by Fred I just switched to
MathNet.Numerics.Distributions.Normal.Sample(mean, stddev) where
mean = n * P
stddev = Math.Sqrt(n * P * (1 - P)); and now it is O(1) !

Binomial Distribution? https://en.wikipedia.org/wiki/Binomial_distribution — Dmitry Bychenko, May 04 '18 at 14:04
Yes, but how to get the number of successes out of only ONE random number? — Nick Sick, May 04 '18 at 14:05
Are you talking about a Random Seed being the 'ONE random number'? — Ryan C, May 04 '18 at 14:06
https://stackoverflow.com/questions/1728736/c-numerical-algorithm-to-generate-numbers-from-binomial-distribution — Dmitry Bychenko, May 04 '18 at 14:07
Unless I'm misreading things here, are you essentially asking how to ensure that **one** coin flip (i.e. one randomized outcome) is exactly 50% heads and 50% tails? Because that is clearly impossible. [It is important to remember that the law only applies (as the name indicates) when a large number of observations is considered. **There is no principle that a small number of observations will coincide with the expected value** or that a streak of one value will immediately be "balanced" by the others (see the gambler's fallacy).](https://en.wikipedia.org/wiki/Law_of_large_numbers) — Flater, May 04 '18 at 14:07
@Flater No, I am making N actual draws but with only one random number. — Nick Sick, May 04 '18 at 14:10
@NickSick: Do you mean one random **seed**, or one random **number**? If the latter, then my comment remains correct. — Flater, May 04 '18 at 14:10
So you want to know the number of successes with N trials? That's N*P on average. Want to know number of trials to get a success? That's 1/P on average. I can't figure if that's what you want really — Fred, May 04 '18 at 14:12
@Fred, I want the same answer my loop gives me but without the loop and only supplying one random number. — Nick Sick, May 04 '18 at 14:14
Okay your question is so convoluted I didn't get to the end. So that's not what you want then... — Fred, May 04 '18 at 14:14
Well... If you're not sampling N numbers then the very best you can do is get the expected value — Fred, May 04 '18 at 14:14
@Fred, No, I know its possible from only ONE random number. Every outcome has its own probability and the sum of all probabilites is 1. So a number between 0 and 1 will be able to determine the number of successes if you think of having a list of all outcomes. — Nick Sick, May 04 '18 at 14:18
@NickSick: Your last comment here is seemingly arguing that revealing a single card from a deck can prove that the deck is complete and contains no duplicates; which is clearly impossible. — Flater, May 04 '18 at 14:20
@Flater no, but a single random number can easily represent a deck of cards in it current state. — Nick Sick, May 04 '18 at 14:21
@NickSick: Asking the same question again, are you thus referring to **one [random seed](https://en.wikipedia.org/wiki/Random_seed)**, not one random number? Because that has nothing to do with `Random.NextDouble()`. "Next" implies "the next value from the **same seed**". This would still not be a `O(1)` operation either, since you'd still be getting _several_ randomly generated numbers from the same seed. For N draws, that is still a `O(N)` complexity, regardless of whether you randomize the seed for every draw. — Flater, May 04 '18 at 14:23
Well... You could turn that random number into a gaussian sample with mean N*P, which would have the same distribution as your initial function — Fred, May 04 '18 at 14:27
Does that solve your problem? It is probably the closest you'll ever gonna get — Fred, May 04 '18 at 14:29
@Fred My thoughts also go to towards that. Could you elaborate on that. — Nick Sick, May 04 '18 at 14:43
Presumably you are looking for the inverse of the cumulative distribution function (CDF) for the binomial distribution (BINOM.INV in Excel). You generate a uniformly distributed random number in the range [0,1] and then compute the inverse CDF of that variate. Unfortunately, the computation is non-trivial. — rici, May 04 '18 at 15:05
@rici yes that makes sense, so the computation is not trivial? Meaning, there is no generalized fomula? — Nick Sick, May 04 '18 at 15:08
As far as I know. There are efficient accurate approximations for large N and p not too close to 0 or 1. — rici, May 04 '18 at 15:10
Also, if you have a fixed N and p you could compute the CDF and use a binary search to compute the inverse — rici, May 04 '18 at 15:11
You said "a single random number can easily represent a deck of cards in it current state." That's not entirely true. A single number can serve as the seed to a specific random number generator that will generate that particular arrangement of cards. Or, I guess you could use a single number in the range 0..52! to identify a particular permutation of the cards. Whatever the case, your original statement does not hold water. — Jim Mischel, May 04 '18 at 15:50
@JimMischel: I believe OP meant that a single number in the range [0, 52!) could represent an entire deck of cards. Of course, that would require a bignum package. But suppose you wanted to sample the number of Aces in a draw of seven cards? If you had an algorithmic enumeration of the possible draws, you could select one using a number in the range [0,133784560), and then count the aces. More usefully, if you had a CDF for the number of aces, then you could just compute a single uniform variate and then look it up in the CDF. So the question is certainly meaningful. — rici, May 04 '18 at 16:00

David Browne - Microsoft · Answer 1 · 2018-05-04T15:47:12.377

2

Per @rici for small N you can use the CDF or PMF of the Binomial Distribution, and simply compare the random input with the probabibilities for 0,1,2..N successes.

Something like:

  static void Main(string[] args)
    {

        var trials = 10;
        var trialProbability = 0.25;
        for (double p = 0;  p <= 1; p += 0.01)
        {
            var i = GetSuccesses(trials, trialProbability, p);
            Console.WriteLine($"{i} Successes out of {trials} with P={trialProbability} at {p}");
        }
        Console.ReadKey();

    }
    static int GetSuccesses(int N, double P, double rand)
    {
        for (int i = 0; i <= N; i++)
        {
            var p_of_i_successes = MathNet.Numerics.Distributions.Binomial.PMF(P, N, i);
            if (p_of_i_successes >= rand)
                return i;

            rand -= p_of_i_successes;

        }
        return N;


    }

edited May 04 '18 at 15:47

answered May 04 '18 at 15:41

David Browne - Microsoft

80,331
6
39
67

1

Surely you could just use MathNet.Numerics.Distributions.Binomial.Sample to directly produce the desired result? I presume that that function is faster and more efficient than the code you show :-) (I've never used C#, so I'm really treading water here. But that is certainly how I'd do it in C++, using the standard library equivalent.) – rici May 04 '18 at 16:05
1

@rici The Sample() API is a bit different, as it requires a random number generator, and doesn't support passing in the random value directly. Not sure if that matters to OP or not. – David Browne - Microsoft May 04 '18 at 17:53
1

Fair enough. It's not a direct answer, but the intent serms to be to acquire one or more samples from a binomial distribution, ideally in O(1) (per sample), or at least in something less than O(n). If Sample() can accomplish that, it's at least worth mentioning. So I mentioned it :-) – rici May 04 '18 at 18:29
@rici, and David, Thank you guys, You are both really making sense to me. – Nick Sick May 04 '18 at 20:02

score 1 · Answer 2 · answered May 04 '18 at 15:11

I'm not going to write formula here, as it's already in wiki, and I don't really know good formatting here for such things.

Probability for each outcome can be determined by Bernulli formula https://en.wikipedia.org/wiki/Bernoulli_trial

What you need to do is to calculate binominal coefficient, then probability calculation becomes quite easy - multiply binominal coefficient by p and q in appropriate powers. Fill in array P[0..n] that contains probability for each outcome - number of exactly i successes.

After set up go from 0 up to n and calculate rolling sum of probabilities. Check lower/upper bounds against random value and once it's inside current interval, return the outcome.

So, deciding part will be like this:

sum=0;
for (int i = 0; i <= n; i++)
  if (sum-eps < R && sum+P[i]+eps > R)
    return i;
  else
    sum+=P[i];

Here eps is small floating point value to overcome floating point rounding issues, R is saved random value, P is an array of probabilities I mentioned before.

Unfortunately, this method is not practical for big N (20 or 100+):

you'll get quite big impact of rounding errors
random numbers generator can be not determinitive enough to cover every possible outcome with proper probabilities distribution

score 0 · Answer 3 · edited Jun 20 '20 at 09:12

Based on how you've phrased the question, this is impossible to do.

Your are essentially asking how to ensure that a single coin flip (i.e. one randomized outcome) is exactly 50% heads and 50% tails, which is impossible.

Even if you were to use two random numbers, where you expect one heads and one tails; this test would fail in 50% of all cases (because you might get two heads or two tails).

Probablility is founded on the law of large numbers. This explicitly states that a small sampling cannot be expected to accurately reflect the expected outcome.

The LLN is important because it guarantees stable long-term results for the averages of some random events. For example, while a casino may lose money in a single spin of the roulette wheel, its earnings will tend towards a predictable percentage over a large number of spins. Any winning streak by a player will eventually be overcome by the parameters of the game. It is important to remember that the law only applies (as the name indicates) when a large number of observations is considered. There is no principle that a small number of observations will coincide with the expected value or that a streak of one value will immediately be "balanced" by the others (see the gambler's fallacy).

When I asked this as a comment; you replied:

@Flater No, I am making N actual draws but with only one random number.

But this doesn't make sense. If you only use one random value, and keep using that same value, then every draw is obviously going to give you the exact same outcome (that same number).

The closest I can interpret your question in a way that is not impossible would be that you were mistakenly referring to a single random seed as a single random number.

A random seed (or seed state, or just seed) is a number (or vector) used to initialize a pseudorandom number generator.

For a seed to be used in a pseudorandom number generator, it does not need to be random. Because of the nature of number generating algorithms, so long as the original seed is ignored, the rest of the values that the algorithm generates will follow probability distribution in a pseudorandom manner.

However, your explicitly mentioned expectations seem to disprove that assumption. You want to do something like:

GetSuccesses( n, P, Random.NextDouble())

and you're also expecting to get a O(1) operation, which flies in the face of the law of large numbers.

If you're actually talking about having a single random seed; then your expectation are not correct.

If you make N draws, the operation is still of O(N) complexity. Whether you randomize the seed after every draw or not is irrelevant, it's always O(N).
GetSuccesses( n, P, Random.NextDouble()) is going to give you one draw, not one seed. Regardless of terminology used, your expectation of the code is not related to using the same seed for several draws.

As the question is currently phrased; what you want is impossible. Repeated comments for clarification by several commenter have not yet yielded a clearer picture.

As a sidenote, I find it very weird that you've answered to every comment except when directly asked if you are talking about a seed instead of a number (twice now).

Hopefully next time you will not waste your time answering completely unclear questions :) — Evk, May 04 '18 at 14:40
I have always talked about a random number as given to you by Random.NextDouble() but maybe I dont understand the difference. — Nick Sick, May 04 '18 at 14:40
@NickSick: No. Your explanation is all over the place and repeatedly contradicts itself. Going by your update, R is not a random number, it is the ratio of `successes/draws`, which is effectively an approximation of **the probability**. In your own question, how is R meaningfully different from P? — Flater, May 04 '18 at 14:41
@NickSick: Not only do you not understand the difference, you have refused to respond to a question pointing out the difference three times in a row now. That is just a waste of everybody's time; the answer was essentially given the first time I asked if you knew the difference. — Flater, May 04 '18 at 14:43
@Evk: If OP now learns the difference between a seed and a randomly generated number, then my time was not wasted. — Flater, May 04 '18 at 14:44
@Flater, My example shows that you with one random number can determine the outcome of three coin tosses. And yes R is a random number given by Random:NextDouble() — Nick Sick, May 04 '18 at 14:45
@NickSick: For the fourth and last time, **that is a random seed** (i.e. a collection of randomized outcomes). The only difference is that instead of a list of random numbers, you have a list of passes and fails. Which is achieved by **comparing the random numbers (from the seed) to the predetermined probability**. — Flater, May 04 '18 at 14:47
@Flater "The only difference is that instead of a list of random numbers, you have a list of passes and fails. Which is achieved by comparing the random numbers (from the seed) to the predetermined probability. " Yes, and instead of checking that list I would like to have a formula and calculate the number of heads (or whatever). — Nick Sick, May 04 '18 at 15:03
@NickSick That is effectively asking "how do I receive the calculated result without calculating it?" — Flater, May 04 '18 at 15:11

Nick Sick · Accepted Answer · 2018-05-07T11:40:33.350

@David and @rici pointed me to the
MathNet.Numerics.Distributions.Binomial.Sample(P, n)
A benchmark told me that it also O(n) and in par with my original

int countSuccesses = 0;
while(N-- > 0)
{
   if(Random.NextDouble()<P) countSuccesses++; // NextDouble is from 0.0 to 1.0
}

But thanks to a comment done by Fred:

You could turn that random number into a gaussian sample with mean N*P, which would have the same distribution as your initial function

I just switched to
MathNet.Numerics.Distributions.Normal.Sample(mean, stddev) where
mean = n * P
stddev = Math.Sqrt(n * P * (1 - P)); and now it is O(1) !

and the function I wanted got to be:

    private int GetSuccesses(double p, int n)
    {
        double mean = n * p;
        double stddev = Math.Sqrt(n * p * (1 - p));
        double hits = MathNet.Numerics.Distributions.Normal.Sample(Random, mean, stddev);
        return (int)Math.Round(hits, 0);
    }

As Paul pointed out, this is an approximation, but one that I gladly accept.

The normal distribution you have is not the same distribution as your original binomial distribution, but it will be similar for large N (by the central limit theorem). — Paul Hankin, May 06 '18 at 18:45
https://en.wikipedia.org/wiki/Binomial_distribution#Normal_approximation gives a good summary — Paul Hankin, May 06 '18 at 18:49

How to do a number of random trials efficiently?

4 Answers4