full code: https://gist.github.com/QuantVI/79a1c164f3017c6a7a2d860e55cf5d5b
TLDR: sum(a3)
gives a number like 770, when it should be more like 270 - as in 270 of 1000 trials where the results of drawing 4 contained (at least) 2 blue and 1 green ball.
I've rewritten both my way of creating the sample output, and my way of comparing the results twice already. Python as a syntax `all(x in a for x n b)` which I used initially, then change to something more deliberate to see if there was a change. I still have 750+ `True` evaluations of each trial. This is why I reassessed how I was selecting without replacement.
I've tested the draw
function on its own with different Hat
s and was sure it worked.
The expected probability when drawing 4balls, without replacement, from a hat containing (blue=3,red=2,green=6), and having the outcome contain (blue=2,green=1) or ['blue','blue','green'] is around 27.2%. In my 1000 trials, I get higher then 700, repeatedly.
Is the error in Hat.draw()
or is it in experiment()
?
Note: Certain things are commented out, because I am debugging. Thus use sum(a3) as experiment
is commented out to return things other than the probability right now.
import copy
import random
# Consider using the modules imported above.
class Hat:
def __init__(self, **kwargs):
self.d = kwargs
self.contents = [
key for key, val in kwargs.items() for num in range(val)
]
def draw(self, num: int) -> list:
if num >= len(self.contents):
return self.contents
else:
indices = random.sample(range(len(self.contents)), num)
chosen = [self.contents[idx] for idx in indices]
#new_contents = [ v for i, v in enumerate(self.contents) if i not in indices]
new_contents = [pair[1] for pair in enumerate(self.contents)
if pair[0] not in indices]
self.contents = new_contents
return chosen
def __repr__(self): return str(self.contents)
def experiment(hat, expected_balls, num_balls_drawn, num_experiments):
trials =[]
for n in range(num_experiments):
copyn = copy.deepcopy(hat)
result = copyn.draw(num_balls_drawn)
trials.append(result)
#trials = [ copy.deepcopy(hat).draw(num_balls_drawn) for n in range(num_experiments) ]
expected_contents = [key for key, val in expected_balls.items() for num in range(val)]
temp_eval = [[o for o in expected_contents if o in trial] for trial in trials]
temp_compare = [ evaled == expected_contents for evaled in temp_eval]
return expected_contents,temp_eval,temp_compare, trials
#evaluations = [ all(x in trial for x in expected_contents) for trial in trials ]
#if evaluations: prob = sum(evaluations)/len(evaluations)
#else: prob = 0
#return prob, expected_contents
#hat3 = Hat(red=5, orange=4, black=1, blue=0, pink=2, striped=9)
#hat4 = Hat(red=1, orange=2, black=3, blue=2)
hat1 = Hat(blue=3,red=2,green=6)
a1,a2,a3,a4 = experiment(hat=hat1, expected_balls={"blue":2,"green":1}, num_balls_drawn=4, num_experiments=1000)
#actual = probability
#expected = 0.272
#self.assertAlmostEqual(actual, expected, delta = 0.01, msg = 'Expected experiment method to return a different probability.')
hat2 = Hat(yellow=5,red=1,green=3,blue=9,test=1)
b1,b2,b3,b4 = experiment(hat=hat2, expected_balls={"yellow":2,"blue":3,"test":1}, num_balls_drawn=20, num_experiments=100)
#actual = probability
#expected = 1.0
#self.assertAlmostEqual(actual, expected, delta = 0.01, msg = 'Expected experiment method to return a different probability.')