find random country but probability of picking higher population country should be higher

Question

I am working on below interview question:

If you're given a list of countries and its corresponding population, write a function that will return a random country but the higher the population of the country, the more likely it is to be picked at random.

I came up with below logic:

Make two traversal on the list. In the first traversal, sum up all the population of every location to get the total population TOTAL_POP. In the second iteration, calculate % of each location's population against TOTAL_POP. For example, if location A has a popoulation 'a'. The percentage of population for A is (a/TOTAL_POP)*100.

Let's say, after these steps, we have the following values. Location A = 35% B = 22% C = 19% D = 20% E = 4%

Note that the percentages should add up to 100.

Now, randomly generate a number 'n' between 1 and 100.

if 1 <= n <=35 OUTPUT A 36 <= n <= 57 OUTPUT B

Is there any better way to solve this problem? Algorithm or code anything is fine. Also if we are gonna implement this in Java then what is the best data structure to use here?

This is perfectly fine. It is one of the standard ways to sample a discrete set with specified probabilities. A simple list (sorted by their ranges) is enough (do a binary search to find the country). — Nico Schertler, Nov 20 '18 at 07:29
why you should stick to 1 <= n <=35 , this is the not a fair random selection , you could have used "1 <= n <= 90" it is A, so it will have a higher probability to pick A. but it is even unfair random selection. i think what you need is build up a distribution . what you have come with is equivalent to the population distribution among countries. — hunter, Nov 20 '18 at 09:01
what about real figures rounding up to the same integer value ? which country are you going to select then ? And what about countries with a population < 1% ? — Laurent S., Nov 20 '18 at 11:55

RoberMP · Answer 1 · 2018-11-20T08:23:30.427

1

You can use TreeMap for this, it is O(log n) and has a handy api. Something like:

TreeMap<Integer, Country> map = new TreeMap<>();
map.put(percentCountry1, country1);
map.put(percentCountry1 + percentCountry2, country2);
// ...

int random = (new Random()).nextInt(100);
Country country = map.ceilingEntry(random).getValue();

edited Nov 20 '18 at 08:23

answered Nov 20 '18 at 08:16

RoberMP

1,306
11
22

1

You don't need to traverse once to find total population, and then compute percentages. This solution works if you replace percentCountry1 etc above with populationCountry1 (etc) and draw a random number between 1 and total world population (using double or 64-bit integer for the population, since 32-bit integer do not suffice). – Hans Olsson Nov 20 '18 at 09:10

find random country but probability of picking higher population country should be higher

1 Answers1

Linked