I would like to check how often each value of the three variables is printed (each variable taken separately).

For example, I would like to take many random draws and count the frequency of each value, or something similar. How can I do that?

import random

a = "Word A1", "Word A2", "Word A3", "Word A4"

b = "Word B1", "Word B2", "Word B3", "Word B4"

c = "Word C1", "Word C2", "Word C3", "Word C4"


a_random = random.choice(a)
b_random = random.choice(b)
c_random = random.choice(c)

sentence = a_random + ", " + b_random + ", " + c_random

print(sentence)
  • "It often happens to me that the same variable is printed even 2 or 3 consecutive times." - a common human cognitive bias is to underestimate the probability of identical consecutive draws from a truly random distribution. In fact, game developers sometimes make their "random" numbers deliberately correlated (not fully random) because players have this cognitive bias and see truly random processes as unfair. See: https://gamedev.stackexchange.com/questions/62547/make-a-fake-random-distribution – slothrop Apr 02 '23 at 21:04
  • Anyway, to test it out you could take many random draws and use a `collections.Counter` to count the frequency of each value: https://realpython.com/python-counter/ – slothrop Apr 02 '23 at 21:05
  • @slothrop I read your suggestion, but I understood little. Could you show me how to do this in an answer please? (using my code of course). Thank you – Evangelos Dellas Apr 02 '23 at 21:08
  • If you want to know the mechanism behind the random number generation, the [doc](https://docs.python.org/3/library/random.html) says it makes use of the [Mersenne Twister](https://en.m.wikipedia.org/wiki/Mersenne_Twister) algorithm, so you could read up on that. – B Remmelzwaal Apr 02 '23 at 21:10
  • @BRemmelzwaal I expressed myself badly, I apologize. I meant I would need something similar to how user slothrop suggested. Could you help me please? Thank you – Evangelos Dellas Apr 02 '23 at 21:11
  • The implementation of a pseudo-pseudo-random number generation (since `random` generates pseudo-randomly) depends on what you want the output to be. For instance, how often would you want three identical values to appear? And how often three distinct values? From there you can start working on how you would implement that. – B Remmelzwaal Apr 02 '23 at 21:14
  • _for example i would like to take many random draws and count the frequency of each value_ Make a for loop that runs many times (say 1000 or more), inside the loop make a random selection from the choices, keep track of how many times each choice is made, and print the totals for each choice after the loop. What is the difficulty? – John Gordon Apr 02 '23 at 21:33
  • @BRemmelzwaal I didn't know that I could implement these things: "how often would you want three identical values to appear? And how often three distinct values?". Thanks for the food for thought. I would like the same variable (e.g. a_random) to never print the same value 2 times in a row. For example, I wouldn't want it to print Word A3 and then Word A3 again the next time. I would also like a value of the same variable to be printed again only after five tries. For example, if I print Word A3, then the next Word A3 should print after a minimum of five random print attempts. Can you help me? Thank you – Evangelos Dellas Apr 02 '23 at 21:35

3 Answers

for example i would like to take many random draws and count the frequency of each value, or something similar. How can I?

The simplest way to do that would be to write a loop and count how often each random draw occurred. Example:

import random
from collections import Counter

a = "Word A1", "Word A2", "Word A3", "Word A4"

b = "Word B1", "Word B2", "Word B3", "Word B4"

c = "Word C1", "Word C2", "Word C3", "Word C4"

a_counter = Counter()
for i in range(1000):
    a_random = random.choice(a)
    a_counter[a_random] += 1

b_counter = Counter()
for i in range(1000):
    b_random = random.choice(b)
    b_counter[b_random] += 1

c_counter = Counter()
for i in range(1000):
    c_random = random.choice(c)
    c_counter[c_random] += 1

print(a_counter)
print(b_counter)
print(c_counter)

The output shows how many times each word is selected.

Counter({'Word A1': 252, 'Word A4': 251, 'Word A2': 251, 'Word A3': 246})
Counter({'Word B1': 265, 'Word B4': 265, 'Word B3': 250, 'Word B2': 220})
Counter({'Word C1': 266, 'Word C3': 264, 'Word C4': 236, 'Word C2': 234})
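A more compact variant, as a sketch only: `random.choices` with `k` draws many values in one call, and `Counter` tallies them. The helper name `draw_counts` and the fixed seed are illustrative assumptions, not part of the code above. The sketch also prints the standard deviation expected for a fair draw, sqrt(n * p * (1 - p)), which gives a feel for how far a count can sit from the mean by chance alone.

```python
import math
import random
from collections import Counter

a = "Word A1", "Word A2", "Word A3", "Word A4"

def draw_counts(options, n_draws, rng=random):
    """Draw n_draws values uniformly from options and tally them (hypothetical helper)."""
    return Counter(rng.choices(options, k=n_draws))

n_draws = 1000
rng = random.Random(0)  # fixed seed only so that runs are repeatable (assumption)
counts = draw_counts(a, n_draws, rng)

p = 1 / len(a)                              # probability of each word in a fair draw
expected = n_draws * p                      # mean count per word
std_dev = math.sqrt(n_draws * p * (1 - p))  # binomial standard deviation of one count

for word in a:
    print(word, counts[word], f"{counts[word] / n_draws:.1%}")
print(f"expected {expected:.0f} per word, standard deviation about {std_dev:.1f}")
```

Counts that land within roughly two standard deviations of the expected value are unremarkable for a fair draw.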
Nick ODell
  • I understand that it's all random, but for example between "Word B2': 220" and "Word B1': 265" (or others) there is a lot of difference. Strange! Is it possible to customize the randomization? Then add preferences in variable printing frequency? – Evangelos Dellas Apr 02 '23 at 21:40
  • This difference is to be expected. Have a look at the [multinomial distribution](https://en.wikipedia.org/wiki/Multinomial_distribution) (or more simply at the binomial distribution). You should expect to see values between about 223 and 277, 95% of the time (i.e. 250 ± 27). The relative spread shrinks as you perform more draws: you're currently seeing a spread of about 10% of the mean (27 / 250), but if you try 100_000 draws you'd expect it to drop to ~1% – Sam Mason Apr 02 '23 at 22:01
  • @EvangelosDellas Well, it's random. Some deviation from the mean is to be expected. By default all choices are equally probable, but you can change that by using a [weighted random choice](https://stackoverflow.com/questions/3679694/a-weighted-version-of-random-choice) to make some options more probable. – Nick ODell Apr 02 '23 at 22:09
  • @NickODell Thank you. I accepted and upvoted your answer. To stay on topic in the comments, could you help me with this question as well? The code is the same, I ask about custom randomization: https://stackoverflow.com/questions/75914939/is-it-possible-to-customize-the-random-function-to-avoid-too-much-repetition-of – Evangelos Dellas Apr 02 '23 at 22:35
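The weighted random choice mentioned in the comments can be sketched with the standard library's `random.choices`, which accepts a `weights` argument. The weights below are hypothetical preferences chosen only for illustration, and the fixed seed is an assumption to make runs repeatable.

```python
import random
from collections import Counter

b = "Word B1", "Word B2", "Word B3", "Word B4"

# Hypothetical preferences: "Word B1" should come up about half the time,
# "Word B4" about one time in ten.
weights = [5, 2, 2, 1]

rng = random.Random(0)  # fixed seed only for repeatability (assumption)
draws = rng.choices(b, weights=weights, k=10_000)
weighted_counts = Counter(draws)

total = sum(weights)
for word, weight in zip(b, weights):
    # Compare the observed count with the share implied by the weight
    print(word, weighted_counts[word], f"target {weight / total:.0%}")
```

The weights are relative, so `[5, 2, 2, 1]` and `[50, 20, 20, 10]` describe the same distribution.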

How to examine the properties of repeated randomness

I think you are puzzled about why successive calls to the random number generator do not systematically cycle through all the options, one after the other.

If it did, it wouldn't be random.

Try code like this, which shows what happens when word A alone is drawn at random 4 times in a row. I have called the 4 versions of word A "P", "Q", "R" and "S" for brevity, to avoid repeating "word_A" and to avoid using digits (which could be confused with the frequencies).

It repeats this 10,000 times and tabulates how often each sequence occurs.

  • Which patterns come the most frequently?

  • Is PPPP more common than PQRS? (Try running the program many times)

  • If so, why; if not, why not?

  • In what proportion of cases does "P" not show up at all?

  • In what proportion of cases is the second symbol the same as the first?

The study of the answers to these is the basis of probability and statistics.

import random

n_repetitions = 10000
n_options = 4
length = 4

histo = {}
for i in range(n_repetitions):
    # Build one sequence of `length` random letters from "P", "Q", "R", "S"
    string = ""
    for character in range(length):
        string += chr(ord("O") + random.randint(1, n_options))
    # Tally how often each sequence occurs
    if string not in histo:
        histo[string] = 0
    histo[string] += 1

keys = sorted(histo.keys())
print("Seq  Frequency")
for key in keys:
    print(key, histo[key])

Example output

But each run is different!

Seq  Frequency
PPPP 48
PPPQ 36
PPPR 39
PPPS 47
PPQP 34
PPQQ 43
PPQR 44
PPQS 39
PPRP 32
PPRQ 36
PPRR 42
PPRS 36
PPSP 36
PPSQ 33
PPSR 38
PPSS 29
PQPP 38
PQPQ 30
PQPR 36
... etc, to SSSS
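Two of the questions above can also be checked empirically. The following is a minimal sketch under the same setup (4 options, sequences of length 4); the variable names and the fixed seed are assumptions for illustration. It compares the observed proportions with the values probability theory gives for independent, uniform draws.

```python
import random

n_repetitions = 10000
n_options = 4
length = 4
letters = "PQRS"[:n_options]

rng = random.Random(0)  # fixed seed only for repeatability (assumption)
no_p = 0            # sequences that contain no "P" at all
same_first_two = 0  # sequences whose second symbol equals the first

for _ in range(n_repetitions):
    seq = "".join(rng.choice(letters) for _ in range(length))
    if "P" not in seq:
        no_p += 1
    if seq[0] == seq[1]:
        same_first_two += 1

# Theory: each draw misses "P" with probability (n_options - 1) / n_options,
# and the second draw matches the first with probability 1 / n_options.
theory_no_p = ((n_options - 1) / n_options) ** length
theory_same = 1 / n_options

print("no P at all:    ", no_p / n_repetitions, "theory", theory_no_p)
print("second == first:", same_first_two / n_repetitions, "theory", theory_same)
```

A repeat of the previous value in about one draw out of four is therefore what a fair generator should produce with 4 options.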
ProfDFrancis

I would do something like this (one liner!):

import numpy as np

# Change these as needed
groups = [["Word A1", "Word A2", "Word A3", "Word A4"], 
          ["Word B1", "Word B2", "Word B3", "Word B4"],
          ["Word C1", "Word C2", "Word C3", "Word C4"]]
m = 100

# Choose 1 word from each group of words (of which there are n), m times 
results = np.array([[np.random.randint(0, len(group)) for group in groups] for _ in range(m)])

This will produce an m x n array of choices, where each entry is the index of the chosen word within its group. For example, one iteration of your example might produce the row [1,2,0], which would correspond to the sentence "Word A2, Word B3, Word C1". To find out how many times Word A2 was selected, you could use np.unique on column 0 (group A) and look up index 1:

import numpy as np

# Column 0 holds the choices for group A; "Word A2" has index 1 within that group
unique, counts = np.unique(results[:, 0], return_counts=True)

a_counts = dict(zip(unique, counts))  # maps each word index (0 to 3) to its count
a_counts.get(1, 0)  # number of times "Word A2" was drawn

Or to reconstruct sentences:

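def reconstruct(row):
    # Map each index in the row back to the word it selects
    return ", ".join([groups[i][row[i]] for i in range(len(row))])

# The axis and the array are passed positionally, in that order
np.apply_along_axis(reconstruct, 1, results)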
v0rtex20k