I'm trying to figure out a way to iteratively select random values from an array, based on their weights. I'm going to use the example from here, but bear with me, as this is a slightly different question. Here is a class representing a broker:

public class Broker
{
    public string Name = string.Empty;
    public int Weight = 0;

    public Broker(string n, int w)
    {
        this.Name = n;
        this.Weight = w;
    }
}

Now, I have an array of brokers, and I want to randomly select a broker such that the probability of selecting broker a equals a.Weight divided by the sum of all weights in the array.
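To make the requirement concrete, here is a minimal single-draw sketch (the PickOne helper is just an illustrative name, not part of the setup above): draw a uniform number below the total weight and walk the array until it falls inside a broker's slice. Each draw is a linear scan, so repeating it n times this way would already be O(n^2), which is exactly what I'd like to beat.

// A minimal single-draw sketch; PickOne is just an illustrative name.
// Broker b is returned with probability b.Weight / totalWeight via a linear scan.
using System;
using System.Linq;

public static class SingleDraw
{
    public static Broker PickOne(Broker[] brokers, Random rng)
    {
        int total = brokers.Sum(b => b.Weight);
        int target = rng.Next(total);      // uniform in 0 .. total - 1
        foreach (var b in brokers)
        {
            if (target < b.Weight)
                return b;                  // target fell inside b's slice
            target -= b.Weight;
        }
        throw new InvalidOperationException("Weights must sum to a positive value.");
    }
}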

The change here is that I want to do this n times, where n is the size of the array. So we can do some preprocessing before we start (it's not going to take less than O(n) anyway). Also, please note that the weights are of the same order of magnitude as n.

Here are the directions I have thought of so far:

  1. Build an array whose size is the sum of all weights, and place each broker a in it a.Weight times; then draw a random index between 0 and the size of the new array. The problem here, as mentioned, is that the weights are O(n), so this is an O(n^2) algorithm.
  2. I found a way (which some might consider trivial) of solving this in O(n log n): I assign each broker the partial sum of the weights of all brokers before it. Then I draw a number between 0 and the sum of all weights and find the matching broker with binary search (see the sketch after this list). I could also achieve this with a balanced binary search tree.
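For concreteness, here is a minimal sketch of the second direction (the BuildPrefixSums and PickIndex helpers are my own illustrative names). It uses inclusive prefix sums and, per draw, binary-searches for the first prefix sum strictly greater than a uniform draw below the total weight: O(n) setup, O(log n) per draw, O(n log n) for all n draws.

// Sketch of direction 2: prefix sums + binary search. Helper names are illustrative.
using System;

public static class WeightedPicker
{
    // prefix[i] = Weight of brokers[0] + ... + Weight of brokers[i]
    public static int[] BuildPrefixSums(Broker[] brokers)
    {
        var prefix = new int[brokers.Length];
        int running = 0;
        for (int i = 0; i < brokers.Length; i++)
        {
            running += brokers[i].Weight;
            prefix[i] = running;
        }
        return prefix;
    }

    // Draw target in 0 .. total - 1, then find the first index whose
    // prefix sum exceeds target; that index is the selected broker.
    public static int PickIndex(int[] prefix, Random rng)
    {
        int target = rng.Next(prefix[prefix.Length - 1]);
        int lo = 0, hi = prefix.Length - 1;
        while (lo < hi)
        {
            int mid = (lo + hi) / 2;
            if (prefix[mid] <= target)
                lo = mid + 1;
            else
                hi = mid;
        }
        return lo;
    }
}

Drawing n brokers is then just n calls to PickIndex against the same prefix array and a single Random instance.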

Does anyone know of a better solution (i.e. O(n), or even O(log(log(n)))? Or otherwise, can anyone prove that we can't do it faster than O(n log n)?

  • Why do you think your first solution is O(n^2)? Maybe you mean its space complexity. – Karthikeyan Jul 21 '13 at 18:20
  • Hmm! You have to be careful, because you are not measuring the complexity of a deterministic algorithm. In particular, you are using random numbers, which are not computationally free to get. In other words, to give a precise answer to this question, we need to agree on a model of random computation. For example, I would consider getting n random bits to be an O(n) operation. Do you agree with that hypothesis? – hivert Jul 21 '13 at 18:24
  • @Karthikeyan If he uses a data structure with O(1) access and O(1) amortized delete (for example, a Java ArrayList), then picking and deleting n random elements will be O(n), so it'll be less than the O(n^2) needed to construct that array. – G. Bach Jul 21 '13 at 18:24
  • @Ginandi I don't understand how the second solution is O(n log n). After each draw, you'd have to adjust the weights. – G. Bach Jul 21 '13 at 18:26
  • @hivert Given that the number of bits needed is logarithmic in the number of elements he wants to index, picking n random numbers is covered in O(n log n). – G. Bach Jul 21 '13 at 18:27
  • @G.Bach After each draw you find the closest partial sum (say, the closest one from below), and a binary search does that in O(log n). Again, we are not working with the original array but with an array where the weights are replaced by partial sums, as mentioned. We do that n times, so O(n log n). – Ginandi Jul 21 '13 at 18:28
  • @Karthikeyan If you agree that the first example is O(n^2) space complexity, then it would take O(n^2) to fill the new array... – Ginandi Jul 21 '13 at 18:30
  • @Ginandi For some reason I thought you were only dealing with integer weights and that after each draw, the weight of the drawn element is reduced by 1. If that's not the case, the link David posted should be very useful. – G. Bach Jul 21 '13 at 18:30
  • @hivert Well, that is a very important question. If we assume, as G. Bach said, that we cannot get a random number faster than O(log n), then there is my answer. It sounds right, but are we sure that there is no feasible way of getting a random number in, say, O(1)? – Ginandi Jul 21 '13 at 18:34
  • Can't you just assign the weights in the array as in your first solution, then [shuffle the array](http://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle)? That randomizes the array in O(n). You can then walk the array sequentially to sample it (see the sketch after these comments). – Jim Mischel Jul 21 '13 at 18:50
  • @Ginandi I don't think the question "are we sure that there is no feasible way of getting a random number in, say, O(1)?" makes any sense. When one speaks about ``O(n)``, one is working in a fixed computational **model** (e.g. http://en.wikipedia.org/wiki/Random_access_machine); whether the model reflects reality or not is irrelevant to complexity questions. So whether or not you can get an arbitrarily large number of random bits in constant time is certainly not a question of feasibility, but a question of what the **axioms** of your computational model are. – hivert Jul 21 '13 at 19:25
  • @hivert Let's assume that getting a random (log(n)-length) word takes O(1) time, just like every other word operation in the standard unit-cost RAM model. – David Eisenstat Jul 22 '13 at 01:16
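For completeness, here is a minimal sketch of the shuffle idea from Jim Mischel's comment, assuming integer weights (the ExpandAndShuffle name is illustrative): expand each broker Weight times, Fisher-Yates shuffle the result, then walk it sequentially to take samples. Note that building and shuffling the expanded array still costs O(sum of weights), which is the same O(n^2) concern raised against the first direction.

// Sketch of the expand-then-shuffle idea from the comments; names are illustrative.
using System;
using System.Collections.Generic;

public static class ShuffledSampler
{
    public static Broker[] ExpandAndShuffle(Broker[] brokers, Random rng)
    {
        // Place each broker Weight times in the expanded list (O(sum of weights)).
        var expanded = new List<Broker>();
        foreach (var b in brokers)
            for (int i = 0; i < b.Weight; i++)
                expanded.Add(b);

        // Fisher-Yates shuffle: a uniform random permutation in O(expanded.Count).
        for (int i = expanded.Count - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1);
            (expanded[i], expanded[j]) = (expanded[j], expanded[i]);
        }

        return expanded.ToArray();
    }
}

The returned array can then be walked sequentially to take samples, as the comment suggests.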

0 Answers