I am simulating an infinite sequence of die rolls to calculate the average "hitting time" for a pattern. In this particular case I am looking for the first occurrence of a "11" or a "12". For example, in "34241213113..." the first occurrence of "12" is at time 6 and that of "11" is at time 10. Here is my Python code.
import numpy as np

NN = 1000000                # number of sample paths
t11 = np.zeros(NN)          # hitting time of "11" on each path
t12 = np.zeros(NN)          # hitting time of "12" on each path

for i in range(NN):
    prev = np.random.randint(1, 7)      # first roll
    flag11 = True                       # still waiting for "11"
    flag12 = True                       # still waiting for "12"
    ctr = 2                             # time index of the current roll
    while flag11 or flag12:
        curr = np.random.randint(1, 7)
        if flag11 and prev == 1 and curr == 1:
            t11[i] = ctr
            flag11 = False
        if flag12 and prev == 1 and curr == 2:
            t12[i] = ctr
            flag12 = False
        ctr = ctr + 1
        prev = curr

print('Mean t11: %f' % (np.mean(t11)))
print('\nMean t12: %f' % (np.mean(t12)))
As soon as both patterns have been observed, I start a new sample path. It takes about a million sample paths for the sample means to converge to the theoretical values (42 for "11" and 36 for "12"), and the code takes about a minute to run. I am new to Python and have been using it for just about a month.
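Since np.random.randint is called once for every roll, I suspect the per-call overhead is a large part of the runtime. Below is a rough, untested sketch of one variant I have been thinking about, where the rolls are drawn in large blocks and handed out by a generator; the block size of 10**6 and the roll_stream helper are just placeholders I made up, not anything I have benchmarked.

import numpy as np

NN = 1000000

def roll_stream(block_size=10**6):
    # Draw die rolls in large blocks so np.random.randint is called once
    # per block instead of once per roll. The block size is a guess.
    while True:
        for r in np.random.randint(1, 7, size=block_size):
            yield r

rolls = roll_stream()
t11 = np.zeros(NN)
t12 = np.zeros(NN)

for i in range(NN):
    prev = next(rolls)
    flag11 = True
    flag12 = True
    ctr = 2
    while flag11 or flag12:
        curr = next(rolls)
        if flag11 and prev == 1 and curr == 1:
            t11[i] = ctr
            flag11 = False
        if flag12 and prev == 1 and curr == 2:
            t12[i] = ctr
            flag12 = False
        ctr = ctr + 1
        prev = curr

print('Mean t11: %f' % (np.mean(t11)))
print('\nMean t12: %f' % (np.mean(t12)))

The logic is identical to the version above; only the way the rolls are produced changes.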
Beyond tweaks like that, is there a way to speed up the code, maybe with a different algorithm, or by optimizing the routines? Would it have notably different performance in a compiled language vs. an interpreted language? I am