I am trying to write a script that reads the elements of a list sequentially and concatenates them into a regex pattern.

For example, I have a list like this:

lst = ['B.', 'Article', 'III']

And I want to build a regex something like this:

re.search(r lst[0]\s+lst[1]\s+lst[2]).group()

so that it matches the strings below regardless of how much whitespace separates the list elements:

candidate_1 = 'B.      Article III'
candidate_2 = 'B.        Article III'
snapper

1 Answer

Try str.join(), like so:

r'\s+'.join(lst)

Here is a complete program:

import re

def list2pattern(l):
    return r'\s+'.join(l)

lst = ['B.', 'Article', 'III']
assert re.search(list2pattern(lst), 'B. Article III')
assert re.search(list2pattern(lst), 'B.      Article III')
assert not re.search(list2pattern(lst), 'B.Article III')
assert not re.search(list2pattern(lst), 'George')
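
One caveat with joining the raw elements: 'B.' contains a dot, which a regex treats as "any character", so the pattern above would also accept 'Bx Article III'. A minimal sketch of one way around that, assuming the list elements are meant to be matched literally, is to pass each element through re.escape() before joining. The name list2pattern_escaped is hypothetical and used only for illustration:

```python
import re

def list2pattern_escaped(items):
    # Hypothetical variant of list2pattern: escape each element so that
    # characters such as '.' are matched literally, then join with \s+.
    return r'\s+'.join(re.escape(item) for item in items)

lst = ['B.', 'Article', 'III']
pattern = list2pattern_escaped(lst)

# Any run of whitespace between the elements is accepted.
assert re.search(pattern, 'B.      Article III').group() == 'B.      Article III'

# Without escaping, the '.' in 'B.' matches any character, so 'Bx' slips through.
assert re.search(r'\s+'.join(lst), 'Bx Article III')

# With escaping, only a literal 'B.' matches.
assert not re.search(pattern, 'Bx Article III')
```

If the list elements are themselves meant to be regex fragments, escaping would be the wrong choice and the plain join is the one to keep.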
Robᵩ
  • I have a more fundamental question. I tried type(list2pattern(lst)) and it returns str. How does the machine distinguish between str and r str if both are stored as str? – snapper Feb 21 '18 at 07:06
  • @delinco: What are "str" and "r str"? Ah, probably you mean `r""`: https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals. – CristiFati Feb 21 '18 at 07:09
  • The r that is always added before a regex pattern. – snapper Feb 21 '18 at 07:12
  • @delinco - See this question: https://stackoverflow.com/questions/2081640/what-exactly-do-u-and-r-string-flags-do-and-what-are-raw-string-literals – Robᵩ Feb 22 '18 at 03:23
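
On the raw-string question raised in the comments, a short sketch may help. The r prefix only affects how Python reads the literal in the source code; once parsed, the result is an ordinary str, so there is nothing left for the machine to distinguish at runtime. The variable names below are illustrative only:

```python
import re

# The r prefix changes how the literal is read from source code,
# not the type of the resulting object.
raw = r'\s+'
plain = '\\s+'   # same three characters, written with an escaped backslash

assert raw == plain
assert type(raw) is str
assert len(raw) == 3          # backslash, 's', '+'

# Where the prefix matters: '\n' is one newline character,
# while r'\n' is a backslash followed by 'n'.
assert len('\n') == 1
assert len(r'\n') == 2

# The re module receives the same string either way.
assert re.search(raw, 'a  b').group() == '  '
assert re.search(plain, 'a  b').group() == '  '
```

This is why type(list2pattern(lst)) reports str: the raw prefix is a convenience for writing backslashes, and it leaves no trace in the value itself.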