Generate the candidate values for each possible position - even if there is only one candidate for most positions - then create a Cartesian product of those values.
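To see the core idea in isolation, here is what the Cartesian product of per-position candidate lists looks like for a short input such as `'1x1'` (a small illustrative example of my own, using `itertools.product` directly):

```python
>>> from itertools import product
>>> list(product(['1'], ['x', '5'], ['1']))
[('1', 'x', '1'), ('1', '5', '1')]
```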
In the OP's example, the candidates are `['x', '5']` for any position where an `'x'` appears in the input; for every other position, the candidates are a list with a single possibility (the original letter). Thus:
```python
def candidates(letter):
    return ['x', '5'] if letter == 'x' else [letter]
```
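A quick sanity check of its behavior:

```python
>>> candidates('x')
['x', '5']
>>> candidates('1')
['1']
```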
Then we can produce the patterns by building a list of candidates for each position, taking their product with `itertools.product`, and combining each result:
```python
from itertools import product

def combine(candidate_list):
    return ''.join(candidate_list)

def patterns(data):
    all_candidates = [candidates(element) for element in data]
    for result in product(*all_candidates):
        yield combine(result)
```
Let's test it; three `'x'` positions with two candidates each should give 2³ = 8 results:
```python
>>> list(patterns('1xxx1'))
['1xxx1', '1xx51', '1x5x1', '1x551', '15xx1', '15x51', '155x1', '15551']
```
Notice that the algorithm in the generator is fully general; all that varies is the detail of how to generate candidates and how to process results. For example, suppose we want to replace "placeholders" within a string. Then we need to split the string into placeholders and non-placeholders, and have a `candidates` function that generates all the possible replacements for placeholders, and the literal string itself for non-placeholders.
For example, with this setup:
```python
keywords = {'wouldyou': ["can you", "would you", "please"], 'please': ["please", "ASAP"]}
template = '((wouldyou)) give me something ((please))'
```
First we would split the template, for example with a regular expression:
```python
import re

def tokenize(t):
    return re.split(r'(\(\(.*?\)\))', t)
```
This tokenizer will give empty strings where a placeholder appears at the start or end of the template, but this doesn't cause a problem: an empty string simply contributes nothing when the parts are joined back together.
```python
>>> tokenize(template)
['', '((wouldyou))', ' give me something ', '((please))', '']
```
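For comparison, a template of my own with the placeholder in the middle produces no empty tokens at all:

```python
>>> tokenize('give me ((please)) now')
['give me ', '((please))', ' now']
```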
To generate replacements, we can use something like:
```python
def candidates(part):
    if part.startswith('((') and part.endswith('))'):
        return keywords.get(part[2:-2], [part[2:-2]])
    else:
        return [part]
```
That is: placeholder parts are identified by the parentheses, stripped of those parentheses, and looked up in the dictionary (falling back to the bare keyword name if there is no entry for it).
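For example, checking individual tokens against the `keywords` defined above:

```python
>>> candidates('((wouldyou))')
['can you', 'would you', 'please']
>>> candidates(' give me something ')
[' give me something ']
```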
Trying it with the other existing definitions:
```python
>>> list(patterns(tokenize(template)))
['can you give me something please', 'can you give me something ASAP', 'would you give me something please', 'would you give me something ASAP', 'please give me something please', 'please give me something ASAP']
```
To generalize `patterns` properly, rather than having it depend on the global functions `combine` and `candidates`, we should use dependency injection: simply pass those functions in as parameters, making `patterns` a higher-order function. Thus:
```python
from itertools import product

def patterns(data, candidates, combine):
    all_candidates = [candidates(element) for element in data]
    for result in product(*all_candidates):
        yield combine(result)
```
Now the same core code solves any such problem; our two examples become:
```python
def euler_51(s):
    for pattern in patterns(
        s,
        lambda letter: ['x', '5'] if letter == 'x' else [letter],
        ''.join
    ):
        print(pattern)

euler_51('1xxx1')
```
or
```python
def replace_in_template(template, replacement_lookup):
    tokens = re.split(r'(\(\(.*?\)\))', template)
    return list(patterns(
        tokens,
        lambda part: (
            # Look up the keyword in the injected mapping, not a global.
            replacement_lookup.get(part[2:-2], [part[2:-2]])
            if part.startswith('((') and part.endswith('))')
            else [part]
        ),
        ''.join
    ))

replace_in_template(
    '((wouldyou)) give me something ((please))',
    {
        'wouldyou': ["can you", "would you", "please"],
        'please': ["please", "ASAP"]
    }
)
```
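As a final note: because `patterns` is a generator, a caller can also stop early without computing the whole Cartesian product. A minimal sketch, assuming the generalized `patterns` above is in scope (the helper name `first_n_replacements` is hypothetical, not part of the code above):

```python
from itertools import islice
import re

def first_n_replacements(template, replacement_lookup, n):
    # islice pulls only the first n results from the generator,
    # so the remaining combinations are never computed.
    tokens = re.split(r'(\(\(.*?\)\))', template)
    return list(islice(
        patterns(
            tokens,
            lambda part: (
                replacement_lookup.get(part[2:-2], [part[2:-2]])
                if part.startswith('((') and part.endswith('))')
                else [part]
            ),
            ''.join
        ),
        n
    ))
```

For example, `first_n_replacements(template, keywords, 2)` returns just the first two filled-in templates.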