
I'm working on a problem in Python in which I need to search for and replace a certain character everywhere in a string except when it is located between curly braces. I know how to do this when the character is between the braces, but not when it is located outside the braces. Essentially, I want the search to skip over anything between two delimiters.

My current workaround is to perform a search and replace on the entire string, then search and replace again between the braces to undo that portion of the first replace.

Here is an example of the functionality I'm looking for, where regex1 stands for the pattern I need:

>>> import re
>>> str = 'I have a _cat, here is a pic {cat_pic}. Another_pic {cat_figure}'
>>> re.sub(regex1,'/_',str)
'I have a /_cat, here is a pic {cat_pic}. Another/_pic {cat_figure}'

The current solution I am using is a two-step process as follows:

>>> import re
>>> str = 'I have a _cat, here is a pic {cat_pic}. Another_pic {cat_figure}'
>>> s1 = re.sub('_','/_',str)
>>> s1
'I have a /_cat, here is a pic {cat/_pic}. Another/_pic {cat/_figure}'
>>> s2 = re.sub(r'\{(.+?)/_(.+?)\}', r'{\1_\2}', s1)
>>> s2
'I have a /_cat, here is a pic {cat_pic}. Another/_pic {cat_figure}'

Is there a way to do this with a regex in one statement, or is the current two-step process the cleanest method?

Thanks

0235ev
If an answer helped you and worked, please check the checkmark beside the answer to accept it, and you should do the same for your other questions. – hwnd May 21 '14 at 21:02

3 Answers


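Assuming all braces are balanced, you can try this combination of lookaheads. The match is zero-width, so the replacement '/' is inserted in front of each underscore rather than replacing it.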

>>> re.sub(r'(?=_)(?![^{]*\})', '/', str)

Explanation:

(?=       look ahead to see if there is:
  _       '_'
)         end of look-ahead
(?!       look ahead to see if there is not:
 [^{]*    any character except: '{' (0 or more times)
 \}       '}'
)         end of look-ahead
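
As a minimal sketch of the pattern above applied to the string from the question (the variable names `text`, `pattern` and `result` are only illustrative, and `text` is used here to avoid shadowing the built-in `str`):

```python
import re

text = 'I have a _cat, here is a pic {cat_pic}. Another_pic {cat_figure}'

# Zero-width match: the position just before a '_' that is NOT followed
# by a '}' without a '{' in between (in other words, not inside braces).
pattern = re.compile(r'(?=_)(?![^{]*\})')

# Nothing is consumed by the match, so '/' is inserted before the underscore.
result = pattern.sub('/', text)
print(result)
# I have a /_cat, here is a pic {cat_pic}. Another/_pic {cat_figure}
```

Note that this relies on the balanced-braces assumption: a stray '}' later in the string would make every earlier underscore look as if it were inside braces.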

regex101 demo

hwnd

Here's another solution without lookaheads:

re.sub(r'\{.*?\}|_', lambda x: '/_' if x.group(0) == '_' else x.group(0), str)
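
The one-liner above may be easier to follow with the lambda written out as a named function. This is only a sketch; `escape_outside_braces` and `PATTERN` are hypothetical names, not part of the original answer:

```python
import re

# Match either a whole {braced} span or a single underscore.
# The braced alternative is tried first at each position, so underscores
# inside braces are consumed as part of the span and never match alone.
PATTERN = re.compile(r'\{.*?\}|_')

def escape_outside_braces(text):
    """Hypothetical helper: prefix '_' with '/' except inside {...}."""
    def replace(match):
        token = match.group(0)
        # A lone underscore gets escaped; a braced span is returned unchanged.
        return '/_' if token == '_' else token
    return PATTERN.sub(replace, text)

text = 'I have a _cat, here is a pic {cat_pic}. Another_pic {cat_figure}'
print(escape_outside_braces(text))
# I have a /_cat, here is a pic {cat_pic}. Another/_pic {cat_figure}
```

Because braced spans are matched as whole units, a stray '}' outside any braces does not affect the underscores around it. Since `.` does not match a newline by default, a braced span that runs across lines would not be treated as one span unless `re.DOTALL` is added.
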
Daniel

This one-step solution comes straight from the question "Match (or replace) a pattern except in situations s1, s2, s3 etc".

Here's a simple regex that we will use to replace the correct underscores:

{[^}]*}|(_)

The expression on the left of the OR (i.e., |) matches complete {braced strings}. We will ignore these matches. The right side matches and captures underscores to Group 1, and we know they are the right underscores because they were not matched by the expression on the left.

This program shows how to use the regex (see the results at the bottom of the online demo).

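import re

subject = 'I have a _cat, here is a pic {cat_pic}. Another_pic {cat_figure}'
# Left alternative: a complete {braced} span. Right alternative: '_' captured in Group 1.
regex = re.compile(r'{[^}]*}|(_)')

def myreplacement(m):
    if m.group(1):
        # Group 1 is set only for underscores outside braces: escape them.
        return "/_"
    # Otherwise the match is a braced span: return it unchanged.
    return m.group(0)

replaced = regex.sub(myreplacement, subject)
print(replaced)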

Reference

How to match pattern except in situations s1, s2, s3

Community
zx81