Building a Regex Pattern From the Elements of a List

Question

I am trying to write a script that reads the elements of a list sequentially and concatenates them into a single regex pattern.

For example, I have a list like this:

lst = ['B.', 'Article', 'III']

and I want to build a regex from it, something like this (pseudocode):

re.search(r lst[0]\s+lst[1]\s+lst[2]).group()

so that it matches the strings below regardless of how much whitespace separates the elements of the list:

candidate_1 = 'B.      Article III'
candidate_2 = 'B.        Article III'

@Robᵩ I don't know how to concatenate r with a str. If I write r lst[0]\s+lst[1]\s+lst[2] it returns an error, and if I write rlst[0]\s+lst[1]\s+lst[2] it also returns an error. – snapper Feb 21 '18 at 07:02

score 3 · Accepted Answer · answered Feb 21 '18 at 07:00

Try str.join(), like so:

r'\s+'.join(lst)

Here is a complete program:

import re

def list2pattern(l):
    return r'\s+'.join(l)

lst = ['B.', 'Article', 'III']
assert re.search(list2pattern(lst), 'B. Article III')
assert re.search(list2pattern(lst), 'B.      Article III')
assert not re.search(list2pattern(lst), 'B.Article III')
assert not re.search(list2pattern(lst), 'George')
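
One caveat with joining the elements as-is: 'B.' contains a dot, which in a regex matches any character, so the pattern above would also match something like 'Bx Article III'. If the list elements are meant to be literal text (an assumption, since the question does not say), a minimal sketch using re.escape() could look like this. The name list2pattern_escaped is hypothetical and not part of the original answer:

```python
import re

def list2pattern_escaped(items):
    # Hypothetical variant of list2pattern: escape each element so that
    # characters such as '.' are matched literally, then allow one or more
    # whitespace characters between consecutive elements.
    return r'\s+'.join(re.escape(item) for item in items)

lst = ['B.', 'Article', 'III']
pattern = list2pattern_escaped(lst)

assert re.search(pattern, 'B.      Article III')
assert not re.search(pattern, 'Bx Article III')

# For comparison, the unescaped pattern accepts any character after 'B':
assert re.search(r'\s+'.join(lst), 'Bx Article III')
```

If the elements are themselves supposed to be regex fragments, the escaping step should be left out.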

answered Feb 21 '18 at 07:00

Robᵩ

I have a more fundamental question. type(list2pattern(lst)) returns str, so how does the interpreter distinguish between a str and an "r str" if both are stored as str? – snapper Feb 21 '18 at 07:06
@delinco: What are "str" and "r str"? Ah, probably you mean `r""`: https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals. – CristiFati Feb 21 '18 at 07:09
The r that is always added before a regex pattern. – snapper Feb 21 '18 at 07:12
@delinco - See this question: https://stackoverflow.com/questions/2081640/what-exactly-do-u-and-r-string-flags-do-and-what-are-raw-string-literals – Robᵩ Feb 22 '18 at 03:23
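
To make the point from the links above concrete: the r prefix only changes how the literal is read in the source code. The resulting object is an ordinary str, so there is nothing for the interpreter to distinguish afterwards. A short sketch, with variable names chosen only for illustration:

```python
import re

raw = r'\s+'     # raw literal: the backslash is kept as typed
plain = '\\s+'   # ordinary literal: the backslash has to be doubled

# Both literals produce the same three-character str object.
assert raw == plain
assert type(raw) is str
assert len(raw) == 3

# The prefix matters when the escape means something to Python itself:
# '\b' is a single backspace character, r'\b' is a backslash followed by 'b'.
assert len('\b') == 1
assert len(r'\b') == 2

# Only the raw form reaches the regex engine as a word-boundary assertion.
assert re.search(r'\bIII\b', 'Article III')
assert not re.search('\bIII\b', 'Article III')
```

This is why r'\s+'.join(lst) works: join() simply returns a new str built from the separator and the list elements.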
