I am trying to split a regex pattern across multiple lines, but only the alternative on the last line seems to match. The example below illustrates the problem:

>>> o = re.compile(r'\btext1\b\
... |\btext2\b\
... |\btext3\b')
>>> print o.search(x)
None
>>> x
'text1'
>>> x = 'text3'
>>> print o.search(x)
<_sre.SRE_Match object at 0x025E4CD0>
>>> x = 'text2'
>>> print o.search(x)
None

How can I write this pattern across multiple lines?

>>> o = re.compile(r'\btext1\b|\btext2\b|\btext3\b')
sarbjit
  • Check this answer [pythonic way to create a long multi-line string]: http://stackoverflow.com/questions/10660435/pythonic-way-to-create-a-long-multi-line-string – Don Oct 14 '13 at 13:27

2 Answers

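Use the re.VERBOSE (or re.X) flag, or put (?x) at the start of the regular expression.

In verbose mode, whitespace in the pattern is ignored, so each alternative can sit on its own line.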

>>> import re
>>> o = re.compile(r'''
... \btext1\b |
... \btext2\b |
... \btext3\b
... ''', flags=re.VERBOSE)
>>> o.search('text1')
<_sre.SRE_Match object at 0x0000000001E58578>
>>> o.search('text2')
<_sre.SRE_Match object at 0x0000000002633370>
>>> o.search('text3')
<_sre.SRE_Match object at 0x0000000001E58578>
>>> o.search('text4')
>>>
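
The inline-flag variant can be sketched as follows. This is a minimal example assuming Python 3; the name `pattern` and the comments inside the regex are illustrative and not part of the original answer. Verbose mode also allows `#` comments inside the pattern.

```python
import re

# (?x) at the very start switches on verbose mode for the whole pattern,
# so whitespace and #-comments inside the pattern are ignored.
pattern = re.compile(r'''(?x)
    \btext1\b |   # first alternative
    \btext2\b |   # second alternative
    \btext3\b     # third alternative
''')

for candidate in ('text1', 'text2', 'text3', 'text4'):
    # Prints True for text1..text3 and False for text4.
    print(candidate, bool(pattern.search(candidate)))
```

No `flags=` argument is needed here, because the flag travels with the pattern string itself.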
falsetru

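In a raw string, a \ at the end of a line in the source code is not a line continuation: both the backslash and the newline that follows it become part of the string. In the compiled pattern, that pair matches a literal newline, which is why only the alternative on the last line matches input without one.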
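
A small sketch of that behaviour, assuming Python 3 and a shortened, two-alternative version of the question's pattern (the names `broken` and `o` are only for illustration):

```python
import re

# Same layout as in the question: a raw string "continued" with a backslash.
broken = r'\btext1\b\
|\btext2\b'

# The backslash and the newline are both still inside the string.
print('\\\n' in broken)             # True

o = re.compile(broken)
print(o.search('text1'))            # None: the first alternative also needs a newline
print(bool(o.search('text1\n')))    # True: the escaped newline matches literally
print(bool(o.search('text2')))      # True: the last line has no trailing backslash
```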

I propose to use one of these syntaxes instead:

o = re.compile(r'\btext1\b'
               r'|\btext2\b'
               r'|\btext3\b')

or

o = re.compile(r'\btext1\b|\btext2\b|\btext3\b')

or use the re.VERBOSE flag, as @falsetru proposed in his answer. It lets you insert whitespace (including newlines), which the pattern parser skips when compiling your pattern.

Debugging hint: You can output o.pattern:

print o.pattern

to inspect the pattern the compiled regexp is based on. This would have shown you the problem in your case.
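
As a rough illustration of that hint, assuming Python 3: printing `repr()` of the `pattern` attribute, rather than the attribute itself, makes stray newlines easy to spot. The names `continued` and `concatenated` are hypothetical.

```python
import re

# The question's layout: backslash-continued raw string.
continued = re.compile(r'\btext1\b\
|\btext2\b\
|\btext3\b')

# Adjacent raw string literals, as proposed above.
concatenated = re.compile(r'\btext1\b'
                          r'|\btext2\b'
                          r'|\btext3\b')

# repr() shows escape sequences, so the embedded newlines become visible.
print(repr(continued.pattern))
print(repr(concatenated.pattern))

# Adjacent literals are joined into exactly the one-line pattern.
print(concatenated.pattern == r'\btext1\b|\btext2\b|\btext3\b')  # True
```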

Alfe