For a research project, we are currently using the Gamblers Ruin Algorithm to produce random postfix expressions (terms) using term variables "x", "y", and "z" and an operator "*", as shown in the method below:
def gamblers_ruin_algorithm(prob=0.3,
min_term_length=1,
max_term_length=None):
"""
Generate a random term using the gamblers ruin algorithm
:type prop: float
:param prob: Probability of growing the size of a random term
:type max_term_length: int
:param max_term_length: Maximum length of the generated term
"""
term_variables = ["x", "y", "z"]
substitutions = ("EE*", "I")
term = "E"
term_length = 0
# randomly build a term
while("E" in term):
rand = uniform(0, 1)
if rand < prob or term_length < min_term_length:
index = 0
term_length += 1
else:
index = 1
if (max_term_length is not None and
term_length >= max_term_length):
term = term.replace("E", "I")
break
term = term.replace("E", substitutions[index], 1)
# randomly replace operands
while("I" in term):
term = term.replace("I", choice(term_variables), 1)
return term
The following method will produce a random postfix term like the following:
xyz*xx***zyz*zzx*zzyy***x**x****x**
This issue with this method is that when run thousands of times, it tends to frequently produce duplicate expressions.
Is there a different algorithm for producing random posfix expressions of an arbitrary length that minimizes the probability of producing the same expression more than once?