
In an interview, I was asked to describe a function that returns a random name given an object of {name: weight} pairs as input, where each name is chosen with probability proportional to its weight. I was allowed to use a library to generate numbers in [0,1], but nothing else. I answered that the task needs to be broken down into:

  1. Converting the weights to cumulative sums, e.g. [0.25, 0.5, 0.25] -> [0.25, 0.75, 1], saved as c_arr.

  2. Producing a random number in [0,1], saved as val, then iterating through c_arr with a for loop: if val is greater than the ith element of c_arr, continue; otherwise, return the name associated with the ith element. (A sketch of this approach follows below.)

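In code, the approach I described looks roughly like this (a minimal C# sketch; WeightedPick and the variable names are mine, not from the interview):

using System;
using System.Collections.Generic;

class Program
{
    // Returns a name with probability proportional to its weight.
    // Step 1 builds the cumulative array; step 2 scans it linearly: O(n).
    static string WeightedPick(Dictionary<string, double> weights, Random rng)
    {
        var names = new List<string>();
        var cArr = new List<double>();           // e.g. [0.25, 0.75, 1.0]
        double running = 0;
        foreach (var kv in weights)
        {
            running += kv.Value;
            names.Add(kv.Key);
            cArr.Add(running);
        }

        double val = rng.NextDouble() * running; // uniform draw in [0, total)
        for (int i = 0; i < cArr.Count; i++)
            if (val < cArr[i])                   // first cumulative sum above val
                return names[i];
        return names[names.Count - 1];           // floating-point safety net
    }

    static void Main()
    {
        var rng = new Random();
        var weights = new Dictionary<string, double>
        {
            ["a"] = 0.25, ["b"] = 0.5, ["c"] = 0.25
        };
        Console.WriteLine(WeightedPick(weights, rng));
    }
}
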
My interviewer noted that this was acceptable, but asked if I could articulate its performance using Big O notation. I'm not very knowledgeable here, but I answered that it should run in linear time. He responded that this is true; however, the pseudocode function could be improved to O(log n) time.

To my knowledge, log(n) time would mean not iterating through each and every integer in the range [0,1,2,3,4,5,6,7,8,...,x], but rather something like [1,2,4,8,...,x]; that is, the number of steps grows with the number of doublings rather than with x itself. I'm not sure this intuition is valid. But even if it is, I don't see how the pseudocode function could be improved so that it doesn't iterate through every element of c_arr in a linear fashion.

Could someone explain?

jbuddy_13

1 Answer


https://softwareengineering.stackexchange.com/questions/150616/get-weighted-random-item/288599

Stealing the answer from Emperor Orionii [I take no credit]:

Now to the interesting part: how efficient is this approach, and what's the most efficient solution? My piece of code requires O(n) memory and runs in O(n) time. I don't think it can be done with less than O(n) space, but the time complexity can be much lower, O(log n) in fact. The trick is to use binary search instead of the regular for loop.

// rng is assumed to be a System.Random instance; NextDouble() returns a double in [0, 1).
double r = rng.NextDouble() * totalSum;
int lowGuess = 0;
int highGuess = fruit.Count - 1;

while (highGuess >= lowGuess)
{
    // csum[i] is the cumulative weight up to and including item i,
    // so item i owns the interval [csum[i] - weight[i], csum[i]).
    int guess = (lowGuess + highGuess) / 2;
    if (csum[guess] < r)
        lowGuess = guess + 1;       // r lies above this item's interval
    else if (csum[guess] - weight[guess] > r)
        highGuess = guess - 1;      // r lies below this item's interval
    else
        return fruit[guess];        // r falls inside this item's interval
}

There is also the question of updating weights. In the worst case, updating the weight of one element forces an update of the cumulative sums of all elements after it, pushing the update cost to O(n). That too can be cut down to O(log n) using a binary indexed tree (Fenwick tree), sketched below.
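
To make that concrete, here is a minimal sketch of a binary indexed tree that supports both O(log n) weight updates and O(log n) sampling (the class and method names are mine, not from the linked answer):

using System;

class FenwickTree
{
    private readonly double[] tree; // 1-indexed; tree[j] stores a partial sum
    private readonly int n;

    public FenwickTree(int size)
    {
        n = size;
        tree = new double[n + 1];
    }

    // Add delta to the weight of item i (0-based): O(log n).
    public void Update(int i, double delta)
    {
        for (int j = i + 1; j <= n; j += j & (-j))
            tree[j] += delta;
    }

    // Sum of the weights of items 0..i: O(log n).
    public double PrefixSum(int i)
    {
        double s = 0;
        for (int j = i + 1; j > 0; j -= j & (-j))
            s += tree[j];
        return s;
    }

    // Given r in [0, total), find the item whose cumulative interval
    // contains r, descending the implicit tree instead of binary
    // searching an explicit csum array: O(log n).
    public int Sample(double r)
    {
        int pos = 0;
        int step = 1;
        while (step * 2 <= n) step *= 2; // highest power of two <= n
        for (; step > 0; step /= 2)
        {
            if (pos + step <= n && tree[pos + step] <= r)
            {
                pos += step;
                r -= tree[pos];
            }
        }
        return pos; // 0-based index of the selected item
    }

    static void Main()
    {
        var rng = new Random();
        var ft = new FenwickTree(3);
        double[] weights = { 0.25, 0.5, 0.25 };
        for (int i = 0; i < weights.Length; i++)
            ft.Update(i, weights[i]);

        double total = ft.PrefixSum(weights.Length - 1);
        Console.WriteLine(ft.Sample(rng.NextDouble() * total));
    }
}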

  • Great answer and thanks for linking the Q/A! The binary search is a genius approach. I've heard of it before but never implemented it myself, so it wasn't a concept I would've instinctively known to reach for. – jbuddy_13 Jul 28 '21 at 17:52
  • One of the fastest alternatives if you are going to generate more than a handful of values with static probabilities is to use an [alias table](https://stackoverflow.com/a/17253335/2166798). Once the table is constructed, you can generate values in O(1) time. – pjs Jul 28 '21 at 20:28
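
For reference, here is a minimal sketch of the alias method pjs mentions, using Vose's construction (the AliasTable and Next names are illustrative, not from the linked answer): setup is O(n), after which every draw costs O(1) regardless of how many items there are.

using System;
using System.Collections.Generic;

class AliasTable
{
    private readonly double[] prob; // probability of keeping bucket i
    private readonly int[] alias;   // fallback item for bucket i
    private readonly Random rng;

    public AliasTable(double[] weights, Random rng)
    {
        this.rng = rng;
        int n = weights.Length;
        prob = new double[n];
        alias = new int[n];

        // Scale weights so an average bucket holds exactly probability 1.
        double total = 0;
        foreach (double w in weights) total += w;
        var scaled = new double[n];
        var small = new Stack<int>();
        var large = new Stack<int>();
        for (int i = 0; i < n; i++)
        {
            scaled[i] = weights[i] * n / total;
            if (scaled[i] < 1.0) small.Push(i); else large.Push(i);
        }

        // Pair each under-full bucket with an over-full one.
        while (small.Count > 0 && large.Count > 0)
        {
            int s = small.Pop(), l = large.Pop();
            prob[s] = scaled[s];
            alias[s] = l;
            scaled[l] -= 1.0 - scaled[s]; // l donated the remainder of bucket s
            if (scaled[l] < 1.0) small.Push(l); else large.Push(l);
        }
        while (large.Count > 0) prob[large.Pop()] = 1.0;
        while (small.Count > 0) prob[small.Pop()] = 1.0; // numerical leftovers
    }

    // Each draw: pick a bucket uniformly, then flip a biased coin. O(1).
    public int Next()
    {
        int i = rng.Next(prob.Length);
        return rng.NextDouble() < prob[i] ? i : alias[i];
    }
}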