7

I have 3 labels: "A", "B", "C".

I want to generate a random list of 100 elements in which 60% are "A", 30% are "B", and 10% are "C".

How can I do this? (I am new to Python, so I hope this question is not too silly.)

Edit: My question is slightly different from this question: Generate random numbers with a given (numerical) distribution

As noted in the comments, I want exactly 60% of the elements to be "A", rather than each element having a 60% probability of being "A". So numpy.random.choice() is not the solution for me.

Community
xirururu
  • That's not really random then, is it? – marsh Mar 22 '15 at 01:08
  • random with given distribution: ABAAACBAAAB – aaaaa says reinstate Monica Mar 22 '15 at 01:09
  • Ah! That makes more sense. – marsh Mar 22 '15 at 01:09
  • 1
    You need to clarify if you mean that *exactly* 60 of them will be A, and so you simply need to build a list and shuffle it, or if you want each element to have a 60% *chance* of being A (and so sometimes you'll get 65 As, sometimes 45 As, very rarely 5 As, and so on.) – DSM Mar 22 '15 at 01:10
  • 1
    Maybe some of the answers here help? http://stackoverflow.com/questions/4265988/generate-random-numbers-with-a-given-numerical-distribution – avacariu Mar 22 '15 at 01:13
  • @DSM Thanks for the answer! I actually mean the first case. But if I also wanted to handle the second case, what should I do? – xirururu Mar 22 '15 at 01:13
  • @marsh it is random, just **non-uniform** – smci Mar 22 '15 at 04:27
  • @xirururu: do you just want the case where the proportions are (small) integers, or the general case where they're arbitrary real numbers? Anyway, people have given you both. – smci Mar 22 '15 at 04:32
  • related: [Weighted random sample in python](http://stackoverflow.com/a/13052108/4279) – jfs Mar 22 '15 at 15:51

3 Answers

5

You can just permute a list. Let's say you create the list

x = list('A'*60 + 'B'*30 + 'C'*10)

Then, you can shuffle this in-place like so:

from random import shuffle
shuffle(x)
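
The same build-then-shuffle idea can be wrapped in a small helper. This is only a sketch: the function name `shuffled_labels`, its `counts` dictionary, and the optional `seed` parameter are made up for illustration and are not part of the answer above.

```python
import random

def shuffled_labels(counts, seed=None):
    """Return a shuffled list holding each label exactly counts[label] times."""
    rng = random.Random(seed)  # private generator; seed=None means unseeded
    # Build the list with the exact counts first, e.g. 60 "A", 30 "B", 10 "C".
    items = [label for label, n in counts.items() for _ in range(n)]
    rng.shuffle(items)  # shuffles in place
    return items

labels = shuffled_labels({"A": 60, "B": 30, "C": 10})
print(labels.count("A"), labels.count("B"), labels.count("C"))  # 60 30 10
```

Only the order is random here; the counts are fixed by construction.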
ssm
3

Something like this, if each element should be drawn independently from a uniform random number: "A" will then occur in about 60% of cases on average, and likewise "B" in about 30% and "C" in about 10%.

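import random

n_samples = 100
res = []
for i in range(n_samples):
    r = random.random()  # uniform float in [0.0, 1.0)
    if r <= 0.6:         # about 60% of draws
        res.append('A')
    elif r > 0.7:        # about 30% of draws
        res.append('B')
    else:                # 0.6 < r <= 0.7, about 10% of draws
        res.append('C')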
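
For this per-element case, the standard library can do the weighted draw directly. A minimal sketch, assuming Python 3.6 or newer, where `random.choices` is available; the variable names are illustrative only.

```python
import random
from collections import Counter

# Each of the 100 elements is drawn independently with the given weights,
# so the counts are only close to 60/30/10 and change from run to run.
labels = random.choices(["A", "B", "C"], weights=[0.6, 0.3, 0.1], k=100)

tally = Counter(labels)
print(tally["A"], tally["B"], tally["C"])
```

Note that this does not give exactly 60 "A"s; it matches the "each element has a 60% chance" reading of the question.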
smci
  • A tiny corner case, but if r was either exactly 0.6 or .7 your if..elif ladder would have done nothing. I added "<=" signs. – smci Mar 22 '15 at 04:31
  • *r* will never be exactly 0.6, probability of that is formally 0 (although I know what you mean and you are right) – aaaaa says reinstate Monica Mar 22 '15 at 05:43
  • 1
    you could write combined conditions in Python: `0.6 <= r < 0.7` e.g.: `res.append('A' if r < 0.6 else 'B' if 0.6 <= r < 0.9 else 'C')` – jfs Mar 22 '15 at 15:39
3

If you want exactly 60% to be A, 30% B and 10% C and you know there have to be 100 elements in total, you can do something like the following:

import random

num = 100
prob_a = 0.6
prob_b = 0.3
prob_c = 0.1

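# round() rather than int(): a product such as 100*0.29 is 28.999999999999996,
# which int() would truncate to 28
As = round(num*prob_a) * 'A'
Bs = round(num*prob_b) * 'B'
Cs = round(num*prob_c) * 'C'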

# create a list with 60 As, 30 Bs, and 10 Cs
chars = list(As + Bs + Cs)
random.shuffle(chars)

print("".join(chars))

That'll output something like BAAAAABBCBAABABAAAACAABBAABACAACBAACBBBAAACBAAAABAAABABAAAAABBBABAABAABAACCAABABAAAAAACABBBBCABAAAAA
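
If the total does not divide cleanly (say 7 elements instead of 100), the per-label counts no longer come out as whole numbers. One possible way to handle that, sketched here under the assumption that the proportions sum to 1, is to truncate and then give the leftover slots to the labels with the largest fractional parts. The helper name `exact_counts` is hypothetical.

```python
import random

def exact_counts(total, proportions):
    """Split total into integer counts that sum to total (largest remainder first)."""
    raw = {label: total * p for label, p in proportions.items()}
    counts = {label: int(value) for label, value in raw.items()}  # truncate
    shortfall = total - sum(counts.values())
    # Give the leftover slots to the labels with the largest fractional parts.
    by_remainder = sorted(raw, key=lambda label: raw[label] - counts[label], reverse=True)
    for label in by_remainder[:shortfall]:
        counts[label] += 1
    return counts

counts = exact_counts(7, {"A": 0.6, "B": 0.3, "C": 0.1})
chars = [label for label, n in counts.items() for _ in range(n)]
random.shuffle(chars)
print(counts, "".join(chars))
```

With 100 elements and the 60/30/10 split this reduces to the same counts as the code above.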

avacariu