15

This is a simple example:

import re

math='<m>3+5</m>'
print re.sub(r'<(.)>(\d+?)\+(\d+?)</\1>', int(r'\2') + int(r'\3'), math)

It gives me this error:

ValueError: invalid literal for int() with base 10: '\\2'

It passes the literal string \\2 to int() instead of the matched values 3 and 5.

Why? How do I solve it?

Stefan van den Akker
user1586464
    Possible duplicate of [Python replace string pattern with output of function](http://stackoverflow.com/questions/12597370/python-replace-string-pattern-with-output-of-function) – thakis Feb 15 '16 at 22:13

2 Answers

27

If you want to use a function with re.sub, you need to pass a function, not an expression. As documented for re.sub, your function should take the match object as its argument and return the replacement string. You can access the groups with the usual .group(n) method and so on. An example:

re.sub("(a+)(b+)", lambda match: "{0} as and {1} bs ".format(
    len(match.group(1)), len(match.group(2))
), "aaabbaabbbaaaabb")
# Output is '3 as and 2 bs 2 as and 3 bs 4 as and 2 bs '

Note that the function should return strings (since they will be put back into the original string).
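Applied to the string from the question, the same idea might look like the sketch below. The helper name add_operands is hypothetical, chosen only for illustration; the pattern and input are the ones from the question.

```python
import re

def add_operands(match):
    # Hypothetical helper: groups 2 and 3 hold the two operands as strings.
    total = int(match.group(2)) + int(match.group(3))
    # re.sub expects a string back, so convert the sum before returning it.
    return str(total)

math = '<m>3+5</m>'
result = re.sub(r'<(.)>(\d+?)\+(\d+?)</\1>', add_operands, math)
print(result)  # prints 8
```

Because the pattern matches the whole string, tags included, the tags are replaced along with the numbers.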

BrenBarn
7

You need to use a lambda function.

print(re.sub(r'<(.)>(\d+?)\+(\d+?)</\1>', lambda m: str(int(m.group(2)) + int(m.group(3))), math))
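
As for why the original call fails: Python evaluates the arguments before re.sub runs, so int() receives the literal two-character string \2 rather than a matched group. A rough sketch of that behaviour follows; the variable names pattern, eager_error and copied are only illustrative.

```python
import re

math = '<m>3+5</m>'
pattern = r'<(.)>(\d+?)\+(\d+?)</\1>'

# int(r'\2') is evaluated on its own, before any matching happens,
# so it fails on the backslash-2 text itself.
try:
    int(r'\2')
    eager_error = None
except ValueError as exc:
    eager_error = exc

# A plain string replacement can copy groups into the result,
# but it cannot do arithmetic on them.
copied = re.sub(pattern, r'\2 and \3', math)
print(copied)  # prints 3 and 5
```

Backreferences such as \2 are only expanded by re.sub inside a replacement string, which is why any computation on the groups has to happen in a function.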
xdazz