>>> s = 'foo: "apples", bar: "oranges"'
>>> pattern = 'foo: "(.*)"'

I want to be able to substitute into the group like this:

>>> re.sub(pattern, 'pears', s, group=1)
'foo: "pears", bar: "oranges"'

Is there a nice way to do this?

Peter Graham
    Your pattern uses the greedy operator `.*`, meaning it will get the longest match it can find, which means that in your case the group will be `apples", bar: "oranges`. You're looking for `(.*?)` – abyx Jun 17 '10 at 06:38
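The difference the comment describes can be checked directly. This is a minimal sketch using the string from the question; the variable names `greedy` and `lazy` are only illustrative.

```python
import re

s = 'foo: "apples", bar: "oranges"'

# Greedy: .* runs to the last double quote in the string.
greedy = re.search(r'foo: "(.*)"', s).group(1)
# Non-greedy: .*? stops at the first double quote it can.
lazy = re.search(r'foo: "(.*?)"', s).group(1)

print(greedy)  # apples", bar: "oranges
print(lazy)    # apples
```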

2 Answers


Something like this works for me:

import re
# groups 1 and 3 capture the text to keep; group 2 is the value being replaced
rx = re.compile(r'(foo: ")(.*?)(".*)')
s_new = rx.sub(r'\g<1>pears\g<3>', s)
print(s_new)

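Notice the ? in the regex: it makes .* non-greedy, so group 2 ends at the first ". Also notice that the " characters are inside groups 1 and 3, because they must appear in the output.

Instead of \g<1> (or \g<number>) you can use just \1, but remember to use "raw" strings. The \g<1> form is preferred because \1 can be ambiguous when the replacement continues with a digit: \g<1>0 means group 1 followed by the character 0, while \10 refers to group 10 (see the examples in the Python re documentation).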
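As a small illustration of that ambiguity, here is a sketch that reuses the pattern above with a replacement text starting with a digit. The replacement "7 pears" and the variable names are made up for the example.

```python
import re

s = 'foo: "apples", bar: "oranges"'
rx = re.compile(r'(foo: ")(.*?)(".*)')

# \g<1> ends at the closing '>', so the digit that follows is literal text.
explicit = rx.sub(r'\g<1>7 pears\g<3>', s)
print(explicit)  # foo: "7 pears", bar: "oranges"

# \17 is read as a reference to group 17, which this pattern does not have.
try:
    rx.sub(r'\17 pears\3', s)
    short_form_error = None
except re.error as exc:
    short_form_error = exc
print(short_form_error)
```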

Michał Niklas
re.sub(r'(?<=foo: ")[^"]+(?=")', 'pears', s)

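The regex matches a sequence of characters that

  • follows the string foo: ",
  • doesn't contain double quotation marks, and
  • is followed by "

(?<=...) and (?=...) are lookbehind and lookahead assertions: they must match, but the text they check is not part of the match, so it is not replaced.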
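A quick way to see that the lookarounds are not part of the match is to inspect the match object. This is a minimal sketch, using the original string from the question.

```python
import re

s = 'foo: "apples", bar: "oranges"'
m = re.search(r'(?<=foo: ")[^"]+(?=")', s)

# Only the value between the quotes is matched; the surrounding
# 'foo: "' and '"' are checked but not consumed.
matched_text = m.group(0)
matched_span = m.span()
print(matched_text)  # apples
print(matched_span)  # (6, 12)
```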

This regex will fail if the value of foo contains escaped quotes. Use the following one to handle them too:

re.sub(r'(?<=foo: ")(\\"|[^"])+(?=")', 'pears', s)

Sample code

>>> s = 'foo: "apples \\\"and\\\" more apples", bar: "oranges"'
>>> print(s)
foo: "apples \"and\" more apples", bar: "oranges"
>>> print(re.sub(r'(?<=foo: ")(\\"|[^"])+(?=")', 'pears', s))
foo: "pears", bar: "oranges"
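For comparison, here is a sketch that runs both patterns on the same input with escaped quotes. The names `simple` and `escaped_aware` are only illustrative. The first pattern stops at the backslash before the first escaped quote, so only part of the value is replaced.

```python
import re

# Raw string, so each \" is a literal backslash followed by a quote.
s = r'foo: "apples \"and\" more apples", bar: "oranges"'

simple = re.sub(r'(?<=foo: ")[^"]+(?=")', 'pears', s)
escaped_aware = re.sub(r'(?<=foo: ")(\\"|[^"])+(?=")', 'pears', s)

print(simple)         # foo: "pears"and\" more apples", bar: "oranges"
print(escaped_aware)  # foo: "pears", bar: "oranges"
```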
Amarghosh