1

Using the regex in python, I want to compile a string that gets the pattern "\1" up to "\9". I've tried

regex= re.compile("\\(\d)")  #sre_constants.error: unbalanced parenthesis
regex= re.compile("\\\(\d)") #gets \\4 but not \4

but to no avail..

Any thoughts?

Rawr Rawr
  • 23
  • 3

3 Answers3

3

One more: re.compile("\\\\(\\d)"). Or, a better option, a raw string: re.compile(r"\\(\d)").

The reason is the fact that backslash has meaning in both a string and in a regexp. For example, in regexp, \d is "a digit"; so you can't just use \ for a backslash, and backslash is thus \\. But in a normal string, \" is a quote, so a backslash needs to be \\. When you combine the two, the string "\\\\(\\d)" actually contains \\(\d), which is a regexp that matches \ and a digit.

Raw strings avoid the problem up to a point by giving backslashes a different and much more restricted semantics.

Amadan
  • 191,408
  • 23
  • 240
  • 301
2

You should use a raw-string (which does not process escape sequences):

regex= re.compile(r"\\(\d)")
0

Use raw string:

regex= re.compile(r"\\(\d)")
Marcin
  • 215,873
  • 14
  • 235
  • 294