12

How to get the string between two points using regex or any other library in Python 3?

For example: Blah blah ABC the string to be retrieved XYZ Blah Blah

ABC and XYZ are variables which denote the start and end of the string which I have to retrieve.

sgp

2 Answers

15

Use ABC and XYZ as anchors with look-behind and look-ahead assertions:

(?<=ABC).*?(?=XYZ)

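The (?<=ABC) look-behind assertion only matches at a location in the text that is preceded by ABC. Similarly, (?=XYZ) only matches at a location that is followed by XYZ. Together they form two anchors that limit the .*? expression, which matches anything but, because the trailing ? makes it non-greedy, as little as possible. Neither assertion consumes any text, so ABC and XYZ themselves are not part of the match.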
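As a small sketch of the zero-width behaviour described above, the snippet below runs the pattern against the sample sentence from the question. The variable names text, m and between are only illustrative.

```python
import re

text = "Blah blah ABC the string to be retrieved XYZ Blah Blah"

# The look-behind and look-ahead only check for ABC and XYZ;
# neither marker ends up inside the matched text.
m = re.search(r"(?<=ABC).*?(?=XYZ)", text)
between = m.group(0)

print(repr(between))                      # surrounding spaces are kept
print(text[m.start() - 3:m.start()])      # the three characters before the match
print(text[m.end():m.end() + 3])          # the three characters after the match
```

The match keeps the spaces next to the markers, so call .strip() on the result if they are unwanted.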

You can find all such anchored pieces of text with re.findall():

for matchedtext in re.findall(r'(?<=ABC).*?(?=XYZ)', inputtext):
    print(matchedtext)
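To illustrate why the non-greedy .*? matters once the input contains more than one ABC ... XYZ pair, here is a short comparison. The sample string is made up for the demonstration.

```python
import re

# Hypothetical input with two delimited sections.
inputtext = "ABC one XYZ filler ABC two XYZ"

# Non-greedy: each match stops at the nearest following XYZ.
lazy = re.findall(r"(?<=ABC).*?(?=XYZ)", inputtext)

# Greedy: a single match runs on to the last XYZ in the text.
greedy = re.findall(r"(?<=ABC).*(?=XYZ)", inputtext)

print(lazy)    # [' one ', ' two ']
print(greedy)  # [' one XYZ filler ABC two ']
```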

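If ABC and XYZ are variable, you want to use re.escape() on them (to prevent any of their content from being interpreted as regular expression syntax) and interpolate the result into the pattern. Use re.search() rather than re.match(), because re.match() only matches at the very start of the string:

re.search(r'(?<={}).*?(?={})'.format(re.escape(abc), re.escape(xyz)), inputtext)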
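Putting the escaping and interpolation together, a minimal helper might look like the sketch below. The function name text_between and the use of re.DOTALL are assumptions made for this example, not part of the original answer.

```python
import re

def text_between(start, end, text):
    """Return the first stretch of text between start and end, or None.

    Hypothetical helper: start and end are treated as literal strings.
    """
    # re.escape() neutralises characters such as [ ] ( ) . * in the markers.
    pattern = r"(?<={}).*?(?={})".format(re.escape(start), re.escape(end))
    # Assumption: re.DOTALL is set so the text may span several lines;
    # drop the flag if a match must stay on one line.
    m = re.search(pattern, text, flags=re.DOTALL)
    return m.group(0) if m else None

print(repr(text_between("[start]", "(end)", "x [start] payload (end) y")))
print(text_between("ABC", "XYZ", "no markers here"))
```

Without re.escape(), the markers in the first call would be read as a character class and a group rather than as literal text.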
Martijn Pieters
5

I think this is what you want:

import re
match = re.search('ABC(.*)XYZ','Blah blah ABC the string to be retrieved XYZ Blah Blah')
print(match.group(1))
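Since the question also allows non-regex approaches, the same extraction can be sketched with plain string methods. The function name between_markers is hypothetical, and the markers are assumed to be non-empty strings.

```python
def between_markers(text, start, end):
    """Return the text between the first start marker and the next end marker.

    Hypothetical helper using str.partition(); returns None if either
    marker is missing. Assumes start and end are non-empty.
    """
    _, found_start, rest = text.partition(start)
    if not found_start:
        return None
    middle, found_end, _ = rest.partition(end)
    return middle if found_end else None

sample = "Blah blah ABC the string to be retrieved XYZ Blah Blah"
print(repr(between_markers(sample, "ABC", "XYZ")))
```

No escaping is needed here because the markers are never interpreted as a pattern.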
jcrudy