Algo (Source: Elements of Programming Interviews, 5.16)
You are given n numbers as well as probabilities
p0, p1, ..., pn-1
which sum up to 1. Given a random number generator that produces values in [0, 1]
uniformly, how would you generate one of the n numbers according to their specified probabilities?
Example
If numbers are 3, 5, 7, 11, and the probabilities are
9/18, 6/18, 2/18, 1/18
, then in 1,000,000 calls to the program, 3 should appear about 500,000 times, 7 should appear about 111,111 times, etc.
The book says to create intervals from the cumulative sums p0, p0 + p1, p0 + p1 + p2, etc.,
so in the example above the intervals are [0.0, 0.5), [0.5, 0.8333...), etc.,
and combining these intervals into a sorted array of endpoints gives [9/18, 15/18, 17/18, 18/18]
. Then run the random number generator and find the smallest endpoint that is larger than the generated value — the index of that endpoint maps to an index into the given n
numbers.
This would require O(N) pre-processing time and then O(log N) to binary search for the value.
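For concreteness, here is a minimal sketch of the book's approach (my own code, not the book's): an O(N) pass builds the cumulative endpoints, and each query does an O(log N) binary search. The optional `r` parameter is just an assumption of mine to make the lookup testable with a fixed value.

```python
import bisect
import random

def build_endpoints(probabilities):
    """O(N) pre-processing: cumulative sums p0, p0+p1, ..., 1.0."""
    endpoints = []
    total = 0.0
    for p in probabilities:
        total += p
        endpoints.append(total)
    return endpoints

def select(numbers, endpoints, r=None):
    """O(log N) query: first endpoint strictly greater than r."""
    if r is None:
        r = random.random()  # uniform in [0, 1)
    i = bisect.bisect_right(endpoints, r)
    return numbers[i]
```

For the example above, `build_endpoints([9/18, 6/18, 2/18, 1/18])` yields roughly `[0.5, 0.8333, 0.9444, 1.0]`, so a draw of 0.49 selects 3 and a draw of 0.95 selects 11.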
I have an alternate solution that requires O(N) pre-processing time and O(1) execution time, and I'm wondering what may be wrong with it.
Why can't we iterate through each of the n
numbers, building for each one an array containing that number repeated (probability × 100) times
— e.g. [3] * (9/18 * 100)
. Concatenate all these arrays to get, at the end, a list of 100 elements, where the count of each number corresponds to how likely it is to occur. Then draw a random value, scale it to an index in [0, 100), index into the array, and return the value.
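A minimal sketch of this proposed table approach might look like the following. Note the assumption baked into it: each probability times the table size must come out to a whole number of slots (the example probabilities have denominator 18, so `scale=18` works exactly; `scale=100` would not).

```python
import random

def build_table(numbers, probabilities, scale=100):
    """O(N * scale) pre-processing: repeat each number in
    proportion to its probability. Only exact when every
    probability * scale is a whole number."""
    table = []
    for x, p in zip(numbers, probabilities):
        table.extend([x] * round(p * scale))
    return table

def select(table):
    """O(1) query: pick a uniformly random slot."""
    return table[random.randrange(len(table))]
```

For example, `build_table([3, 5, 7, 11], [9/18, 6/18, 2/18, 1/18], scale=18)` produces an 18-element table containing nine 3s, six 5s, two 7s, and one 11.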
Wouldn't this be more efficient than the provided solution?