How to create a list containing all strings between 2 identical patterns

Question

Given a string such as the example below:
string = 'a bcde:Title - 1 xyz;dummy-a bcde:Title - 2.1 xyz;dummy-a bcde:Title - 3.1 xyz;dummy-'
The content I am interested in lies between 'a bcde:' and ' xyz', so in this case I would like to extract the strings (Title - 1, Title - 2.1, Title - 3.1) and put them in a list.

# following is the code
string = 'a bcde:Title - 1 xyz;dummy-a bcde:Title - 2.1 xyz;dummy-a bcde:Title - 3.1 xyz;dummy-'
start = 'a bcde:'
end = ' xyz'
n = [1,2,3]
title_list = []
for index in n:
    title = (string.split(start))[index].split(end)[0]
    title_list.append(title)
print(title_list)

The current code works as expected because the string is short enough that I can define the occurrences by hand (n = [1,2,3]). When the string is too long to count the occurrences, this becomes a problem. I am looking for a more efficient and explicit way to do this. I expect to get a list of strings containing everything between the start and end patterns, as shown below: ['Title - 1', 'Title - 2.1', 'Title - 3.1', ...]

Thanks !

FObersteiner · Accepted Answer · 2019-11-01T12:10:07.023


Have a look at regular expressions (Python's re module). You could do:

import re

string = 'a bcde:Title - 1 xyz;dummy-a bcde:Title - 2.1 xyz;dummy-a bcde:Title - 3.1 xyz;dummy-'

print(re.findall(r'a bcde:(.*?) xyz', string))
# ['Title - 1', 'Title - 2.1', 'Title - 3.1']

or a bit more versatile as a function:

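def match_between(s, p0, p1):
    # p0 and p1 are regex patterns; (.*?) lazily captures the text between them
    expr = re.compile(p0 + r'(.*?)' + p1)
    # search the argument s rather than the global variable string
    return expr.findall(s)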

patterns = (r'a bcde:', r' xyz')
print(match_between(string, *patterns))
# ['Title - 1', 'Title - 2.1', 'Title - 3.1']
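
One caveat: match_between treats p0 and p1 as regular expressions, so delimiters containing characters such as [ ] ( ) . * or ? would be interpreted as regex syntax. If the delimiters are meant as plain text, a variant along these lines might help. This is only a sketch; the name literal_between and the bracketed test string are made up for illustration and are not part of the original answer.

```python
import re

def literal_between(text, start, end):
    # re.escape makes any regex metacharacters in the delimiters match literally
    pattern = re.compile(re.escape(start) + r'(.*?)' + re.escape(end))
    return pattern.findall(text)

string = 'a bcde:Title - 1 xyz;dummy-a bcde:Title - 2.1 xyz;dummy-a bcde:Title - 3.1 xyz;dummy-'
titles = literal_between(string, 'a bcde:', ' xyz')
print(titles)
# ['Title - 1', 'Title - 2.1', 'Title - 3.1']

# hypothetical delimiters containing metacharacters are matched as plain text
print(literal_between('[a]x[/a] [a]y[/a]', '[a]', '[/a]'))
# ['x', 'y']
```

Note that . does not match newlines by default, so if the text between the delimiters can span several lines, passing re.DOTALL as the flags argument to re.compile would be needed.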

edited Nov 01 '19 at 12:10

answered Nov 01 '19 at 11:59


Thank you, this is what I am looking for :) – Ken Nov 01 '19 at 12:12
great! by the way, to construct and test regex patterns, I find this site pretty useful: https://regex101.com/ – FObersteiner Nov 01 '19 at 12:13
