import regex

product_detail = "yyy target1 target2 xxx".lower()
p1 = r"\btarget1\b|\btarget1 target2\b"
p2 = r"\btarget2\b|\btarget1 target2\b"

for pattern in [p1, p2]:
    # overlapped=True lets a later match start inside an earlier one
    matches = regex.findall(pattern, product_detail, overlapped=True)
    print(matches)

Why do the matches from p1 only give ['target1'] as output, without 'target1 target2', while the matches from p2 successfully give ['target1 target2', 'target2'] as output?
Also, if you can provide a fix, how do I generalise it? I have a list of 10000 target words, and it's not feasible to hardcode them.
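
For context, a likely explanation is that alternation is ordered and `overlapped=True` only moves the next search start forward by one character. It still returns at most one match per start position. In p1 both alternatives start at the same position, so `\btarget1\b` wins and the longer phrase is never reported. In p2 the two matches start at different positions, so both appear. One possible way to generalise, sketched below, sidesteps alternation entirely: tokenise the text and look up every word n-gram in a set of target phrases. This is a swapped-in technique using the standard `re` module, not the `regex` call above. The name `find_targets` is hypothetical, and the sketch assumes targets can be compared as lowercased `\w+` tokens, so punctuation between words is ignored, unlike the original `\b` patterns.

```python
import re

def find_targets(text, targets):
    """Return every target word or phrase found in text, overlaps included."""
    # Assumption: each target is compared as a tuple of lowercased \w+ tokens.
    phrases = {tuple(re.findall(r"\w+", t.lower())) for t in targets}
    phrases.discard(())  # ignore targets that contain no word characters
    max_len = max((len(p) for p in phrases), default=0)

    tokens = re.findall(r"\w+", text.lower())
    found = []
    for start in range(len(tokens)):
        # Try every n-gram that begins at this token, shortest first.
        for length in range(1, max_len + 1):
            if start + length > len(tokens):
                break
            candidate = tuple(tokens[start:start + length])
            if candidate in phrases:
                found.append(" ".join(candidate))
    return found

targets = ["target1", "target2", "target1 target2"]
print(find_targets("yyy target1 target2 xxx", targets))
# ['target1', 'target1 target2', 'target2']
```

Because each lookup is a set membership test, the cost depends on the number of tokens in the text and the longest phrase length, not on the number of targets, so a list of 10000 targets would only affect the one-off cost of building the set.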