Edit after slothrop's comment:
It seems that takewhile eats up the last checked object in iterators (see comment) which means there is no telling if the next element will also satisfy the condition, which makes calculation problematic. Using takewhile on an iterator of infinitely many random realisations seems to be not the best idea (if the realisations have to stay in tact).
End Edit
Problem:
I wrote a monte carlo simulation twice, once as itertool implementation, once in an orthodox style. The itertool one has some weird behaviour and from what I can tell not the right results.... the other one should work fine...
I wonder why this is, apolegies for the large code.
Idea:
My guess is that the takewhile function has some weird behaviour when mixed with RVs.
Notice in the simulate_freqs_iter function there are other specifications commented out, (e.g. zip(takewhile(lambda x: x < t, honest_arrival_times), attack_arrival_times)
) they yield significantly different results still, which I cannot explain to myself...
Before you ask - I'm aware that n=1e5 is too small, I was aiming for n=1e8 or n=1e9, but I won't expect you to run the programm for 20h. With 1e5 it will be done in 10 seconds or so, and the difference is already very stark!
I would be very greatful for an anwer as I have fallen in love with itertools a bit too much (the itertool implementation is not even faster or efficient in any other way...).
Code:
from random import expovariate, seed
from itertools import accumulate, takewhile, count, islice, chain
seed("31.05.2023")
def simulate_freqs_iter(k:int, mu:float, lam:float, n:int, conditional_on_finite:bool=True, verbose:bool=False) -> dict:
"""Simulates n monte-carlo runs of a slow poisson process catching up with a fast one
returns a dictionary containing the empirical cdf for t={10, 11, ..., 50}"""
abs_freqs = {t: 0 for t in range(10, 51)}
for i in range(n):
# Waiting times for each poisson process
honest_arrival_times = accumulate(((expovariate(mu)) for _ in count(start=0, step=1)))
attack_arrival_times = islice( # attack_arrival_times is k blocks behind, but only needs to catch up, not surpass
accumulate(((expovariate((lam))) for _ in count(start=0, step=1))), k-1, None)
# Optional printing function
if verbose and (i%50000)==0:
first_10_honest = list(islice(honest_arrival_times, 10))
first_10_attack = list(islice(attack_arrival_times, 10))
print(list(zip(first_10_honest, first_10_attack)))
honest_arrival_times = chain(first_10_honest, honest_arrival_times)
attack_arrival_times = chain(first_10_attack, attack_arrival_times)
# Simulates poisson waiting times and counts number of hits where 1 catches up with 2
for t in abs_freqs.keys():
# Every time N_2 surpasses N_1 we have a hit
hits = list((t, honest, attack) for honest, attack in
# Take all jump times of N_1 and see if N_2 was there before N_1 within t
# takewhile(lambda x: min(x[0], x[1]) < t, zip(honest_arrival_times, attack_arrival_times))
# zip(takewhile(lambda x: x < t, honest_arrival_times), attack_arrival_times)
zip(honest_arrival_times, takewhile(lambda x: x < t, attack_arrival_times))
# zip(takewhile(lambda x: x < t, honest_arrival_times), takewhile(lambda x: x < t, attack_arrival_times))
if honest > attack)
if hits:
if verbose: print(hits[0])
abs_freqs[t]+=1
# Don't continue this path once found a hit
break
# Condition on the probability that it ever catches up
prob_finite = min(((lam/mu)**k), 1) if conditional_on_finite else 1
# Accumulate relative frequencies to get cdf
results = {key:((value/n) / prob_finite) for key, value in dict(
accumulate(abs_freqs.items(), lambda prev,curr: (curr[0], (curr[1]+prev[1])))).items()}
return results
def simulate_freqs_orthodox(k:int, mu:float, lam:float, n:int, conditional_on_finite:bool=True, verbose:bool=False) -> dict:
"""Simulates n monte-carlo runs of a slow poisson process catching up with a fast one
returns a dictionary containing the empirical cdf for t={10, 11, ..., 50}"""
abs_freqs = {t: 0 for t in range(10, 51)}
# We run the monte carlo n times
for _ in range(n):
# Initialise first block arrival
next_honest_arrival = expovariate(mu)
next_attack_arrival = expovariate(lam)
# the attacker chain is behind k blocks
for _ in range(k-1):
next_attack_arrival += expovariate(lam)
for t in abs_freqs.keys():
while next_attack_arrival < t:
# count when a double spend attack is successful
if next_honest_arrival > next_attack_arrival:
abs_freqs[t] += 1
if verbose: print(next_honest_arrival, next_attack_arrival)
# stop searching for all t on this path once found a hit
break
# Add another block to each chain
next_honest_arrival += expovariate(mu)
next_attack_arrival += expovariate(lam)
else:
# If while-loop terminates without a hit, continue with greater t
continue
# Terminate search for this path after a successful hit
break
# Condition on the probability that it ever catches up
prob_finite = min(((lam/mu)**k), 1) if conditional_on_finite else 1
# Accumulate relative frequencies to get cdf
results = {key:((value/n) / prob_finite) for key, value in dict(
accumulate(abs_freqs.items(), lambda prev,curr: (curr[0], (curr[1]+prev[1])))).items()}
return results
# Monte Carlo simulation of probabilities for T_k
k=10
mu=7/10
lam=3/10
n=int(1e5)
print(simulate_freqs_iter(k=k, mu=mu, lam=lam, n=n, conditional_on_finite=True, verbose=False))
print(simulate_freqs_orthodox(k=k, mu=mu, lam=lam, n=n, conditional_on_finite=True, verbose=False))