Suppose I have a population divided by nationality according to the following proportions (given as fractions that sum to 1):

percentages = {'Germany': 0.4, 'France': 0.25, 'Greece': 0.15, 'Poland': 0.1, 'Norway': 0.05, 'Others': 0.05}

Now I need to generate samples from this population. Is there a way in Python to generate a sample of size n from the population?

For example, if n = 50, I would expect to have something like:

sample = {'Germany': 22, 'France': 10, 'Greece': 8, 'Poland': 6, 'Norway': 3, 'Others': 1}
LJG
  • Does this answer your question? [A weighted version of random.choice](https://stackoverflow.com/questions/3679694/a-weighted-version-of-random-choice) – Peter O. Jan 22 '22 at 18:45

1 Answer

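There is a built-in function for this in the `random` module, `random.choices`, which draws with replacement according to the given weights: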

import random
random.choices(
    population=list(percentages.keys()),
    weights=list(percentages.values()),
    k=50
)

So then you can do:

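import random

# Relative weight of each nationality (fractions that sum to 1).
percentages = {'Germany': 0.4, 'France': 0.25, 'Greece': 0.15, 'Poland': 0.1, 'Norway': 0.05, 'Others': 0.05}

# Draw 50 labels with replacement, weighted by the proportions.
r = random.choices(
    population=list(percentages.keys()),
    weights=list(percentages.values()),
    k=50
)

# Tally the draws; labels that were never drawn stay at 0.
sample = {key: 0 for key in percentages}
for key in r:
    sample[key] += 1

print(sample)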

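This might not be the most efficient method, but it works. Because the draw is random, the counts differ from run to run and only approximate the given proportions.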
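As a variation, the counting loop can be replaced with `collections.Counter` from the standard library. The following is only a minimal sketch: the helper name `sample_counts` and its optional `seed` parameter are made up for illustration and are not part of the original code.

```python
import random
from collections import Counter

percentages = {'Germany': 0.4, 'France': 0.25, 'Greece': 0.15,
               'Poland': 0.1, 'Norway': 0.05, 'Others': 0.05}


def sample_counts(weights, n, seed=None):
    """Draw n labels with replacement and return a count per label.

    `sample_counts` and `seed` are hypothetical names used for this sketch.
    """
    # A private generator keeps seeding local instead of touching global state.
    rng = random.Random(seed)
    draws = rng.choices(list(weights), weights=list(weights.values()), k=n)
    counts = Counter(draws)
    # Counter returns 0 for labels that were never drawn, so every
    # nationality appears in the result, in the original key order.
    return {label: counts[label] for label in weights}


sample = sample_counts(percentages, 50)
print(sample)
```

Passing a fixed `seed` makes a run repeatable, which can help when comparing results; leaving it as `None` gives a different sample each time.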

Robin Dillen
  • could use `collections.Counter` instead of the loop. Nice answer +1 –  Jan 22 '22 at 11:50