I am compiling a regex pattern from a (long) list and then applying it to extract matches. The problem is that when an element of the list contains a special symbol, I cannot compile the list into a regex pattern. Can someone please shed some light on this?
For example, the following code works as long as the symbol ")" is not introduced, but my list will have many elements with different symbols.
import re
# my list
my_tokens = ["my test 1", "my test 2", "my test 3", "many large test n", "my test X"]
# Same list, but the last element ends with ")" -- this one does not work
#my_tokens = ["my test 1", "my test 2", "my test 3", "many large test n", "my test )"]
reg = r'\b(%s|\w+)\b' % '|'.join(my_tokens)
my_test_sentence = "my test 1 and my test 3 and so on my test X"
for token in re.finditer(reg, my_test_sentence):
    print(token.group())
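One direction that might help, sketched under the assumption that the failure comes from unescaped regex metacharacters such as ")": pass each token through `re.escape` before joining. The lookarounds `(?<!\w)` and `(?!\w)` are my own substitution for `\b`, since `\b` would likely not match right after a trailing ")". Sorting the tokens longest-first is also an assumption, made so that a longer token is tried before a shorter one.

```python
import re

# Token list whose last element ends with ")"
my_tokens = ["my test 1", "my test 2", "my test 3", "many large test n", "my test )"]

# Escape every token so that characters like ")" are matched literally.
# Longest tokens are tried first (an assumption, not part of the original code).
escaped = [re.escape(t) for t in sorted(my_tokens, key=len, reverse=True)]

# (?<!\w) and (?!\w) replace \b: they only require that no word character
# sits directly before or after the match, so a token ending in ")" can match.
reg = r'(?<!\w)(?:%s|\w+)(?!\w)' % '|'.join(escaped)

my_test_sentence = "my test 1 and my test 3 and so on my test )"
matches = [m.group() for m in re.finditer(reg, my_test_sentence)]
print(matches)
```

With this sketch the multi-word tokens should come back whole, and every other word is picked up by the `\w+` fallback.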
Thank you in advance!