
Given a string, I need to replace every occurrence of a substring with another, except for occurrences located between two given words.

For example:

substring: "ate", replacement: "drank", 1st word: "wolf", 2nd word: "chicken"

input:  The wolf ate the chicken and ate the rooster
output: The wolf ate the chicken and drank the rooster

Currently, the only solution I have is extremely unclean:

1) Replace the string located between the two words with a temporary substring, via Replace a string located between

2) Replace the substring I originally wanted

3) Revert the temporary substring to the original string

Edit:

I deliberately asked a slightly more general question than my actual case, to keep the answers relevant for future readers.

My specific need is to split a string on ":" while ignoring any ":" that sits between "<" and ">" brackets. The brackets can be nested, and the only guarantee is that the number of opening brackets equals the number of closing brackets.

So, for example, in the following case:

input  a : <<a : b> c> : <a < a < b : b> : b> : b> : a
output [a, <<a : b> c>, <a < a < b : b> : b> : b>, a]

If the answers are very different, I'll start another question.

ErezO
  • wolf: `{`, chicken: `}`, ate:`a`. Are any of these possible: `"a { a a } a"`, `"a {a} a {a} a"`, `"{a {a} }"`, `"{a} a a"`? Can you edit the question to explain some more cases? – Kobi Apr 16 '15 at 12:58
  • yes, especially {a {a} }, in which case none of these "a" should be changed. – ErezO Apr 16 '15 at 13:01
  • In Python, are you using `re` or `regex`? Have you considered a non-regex solution? – Kobi Apr 16 '15 at 13:05
  • re, python 2.7, but same applies for 3.4 – ErezO Apr 16 '15 at 13:09
  • With all the cases in my comment (and more), I'd take a risk and say you cannot do it with a Python `re` regex. With the `regex` module you have recursion (IIRC), but I'm not sure you want to go there either. Write a loop, count `{` and `}`, and replace when `count` is `0`. – Kobi Apr 16 '15 at 13:19
  • Also, please edit the question: the example is confusing, and you should mention more interesting cases. – Kobi Apr 16 '15 at 13:20
  • Are you certain that the constraint you have come up with ("occurrence of word not between two other words") is necessarily the best one? Perhaps another constraint might lead to a more tenable solution - in the quoted example, "last occurrence of a word in a line" would be one such alternative, but I don't know if that's suitable for your real use case... Sometimes redefining the problem at hand can get you out of what seems like a difficult solution... – twalberg Apr 16 '15 at 14:31
  • I specifically asked a slightly different question than my case to keep the answer relevant for future readers. My specific need is splitting a string according to ":", when I need to disregard ":" that are between "<" and ">" brackets that can be chained, where the only promise is that the number of opening brackets equal the number of closing brackets. – ErezO Apr 16 '15 at 16:28
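Kobi's suggestion in the comments (write a loop, count the delimiters, and replace only when the count is 0) could be sketched roughly as follows. The function name `replace_outside` and its parameters are hypothetical, chosen only for illustration. The sketch assumes that nested delimiter pairs protect everything inside them, as in the `{a {a} }` case, and it does not check word boundaries, so "ate" inside a longer word would also be replaced.

```python
def replace_outside(text, old, new, opener, closer):
    """Replace `old` with `new` only where the nesting depth is zero.

    Hypothetical helper: `opener` and `closer` may be single characters
    ("{", "}") or whole words ("wolf", "chicken").
    """
    out = []
    depth = 0  # number of currently open opener/closer pairs
    i = 0
    while i < len(text):
        if text.startswith(opener, i):
            depth += 1
            out.append(opener)
            i += len(opener)
        elif depth > 0 and text.startswith(closer, i):
            depth -= 1
            out.append(closer)
            i += len(closer)
        elif depth == 0 and text.startswith(old, i):
            # Outside every protected region: perform the replacement.
            out.append(new)
            i += len(old)
        else:
            # Ordinary character, or `old` inside a protected region.
            out.append(text[i])
            i += 1
    return "".join(out)


print(replace_outside("The wolf ate the chicken and ate the rooster",
                      "ate", "drank", "wolf", "chicken"))
print(replace_outside("{a {a} } a", "a", "x", "{", "}"))
```

Because the depth is tracked explicitly, this handles nesting that a plain `re` pattern cannot express, at the cost of being longer than a one-line regex.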

2 Answers

import re

def repl(match):
    # A bare "ate" is replaced; a protected "wolf...chicken" span is returned as-is.
    if match.group() == "ate":
        return "drank"
    return match.group()


x = "The wolf ate the chicken and ate the rooster"
print(re.sub(r"(wolf.*chicken)|\bate\b", repl, x))

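You can do this by passing a function as the replacement argument to re.sub. Scanning left to right, the pattern matches the whole wolf...chicken span before it can reach any "ate" inside it, and the function returns that span unchanged; only an "ate" matched on its own by \bate\b is replaced. Note that .* is greedy, so if the text contains several wolf ... chicken pairs, everything from the first "wolf" to the last "chicken" is protected.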
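The same idea can be wrapped in a reusable function. The following is a minimal sketch, not part of the original answer: the name `replace_outside_span` and its parameters are hypothetical, the words are passed through `re.escape` so that regex metacharacters in them are treated literally, and it uses the lazy `.*?` so that each first-word/second-word pair is protected separately.

```python
import re


def replace_outside_span(text, old, new, first, second):
    """Replace whole-word `old` with `new` except between `first` and `second`.

    Hypothetical helper; assumes `old` starts and ends with word characters
    so that the \\b anchors behave as expected.
    """
    pattern = r"(%s.*?%s)|\b%s\b" % (
        re.escape(first), re.escape(second), re.escape(old))

    def repl(match):
        # Group 1 is set only when the protected span matched.
        if match.group(1) is not None:
            return match.group(1)
        return new

    return re.sub(pattern, repl, text)


print(replace_outside_span("The wolf ate the chicken and ate the rooster",
                           "ate", "drank", "wolf", "chicken"))
print(replace_outside_span("wolf ate chicken, ate, wolf ate chicken",
                           "ate", "drank", "wolf", "chicken"))
```

Testing `match.group(1)` instead of comparing the matched text against "ate" keeps the callback independent of the word being replaced.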

Bhargav Rao
vks

Use re.sub with a lambda as the replacement function, as a one-liner.

>>> import re
>>> s = "The wolf ate the chicken and ate the rooster"
>>> re.sub(r'wolf.*?chicken|\bate\b', lambda m: "drank" if m.group()=="ate" else m.group(), s)
'The wolf ate the chicken and drank the rooster'

Update:

The updated problem can be solved with the third-party regex module, which supports recursive patterns such as (?R).

>>> import regex
>>> s = "a : <<a : b> c> : <a < a < b : b> : b> : b> : a"
>>> [i for i in regex.split(r'(<(?:(?R)|[^<>])*>)|\s*:\s*', s) if i]
['a', '<<a : b> c>', '<a < a < b : b> : b> : b>', 'a']

DEMO

Avinash Raj
  • The DEMO link isn't working properly, the attached python example works perfectly. – ErezO Apr 19 '15 at 07:09
  • yep, it shows only the captured text. Added just to show how nested `<>` are captured. – Avinash Raj Apr 19 '15 at 07:10
  • I did find a problem: `a< b >` (no ":") is being split into ['a', '< b >']. I don't want to optimize prematurely, but I have no idea how the performance compares to a tailored non-regex solution. – ErezO Apr 19 '15 at 07:47
  • if you have any further problems, please ask it as a new question along with the sample input and expected output. – Avinash Raj Apr 19 '15 at 07:49
  • asked a follow up question - http://stackoverflow.com/questions/29727339/python-regex-splitting-according-to-criteria – ErezO Apr 19 '15 at 08:02
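For comparison, a tailored non-regex splitter along the lines of the loop-and-count approach suggested in the question's comments might look like the sketch below. The function name `split_top_level` and its parameters are hypothetical. It assumes single-character separators and brackets, strips whitespace around each piece, and treats a stray closing bracket at depth zero as ordinary text. No performance comparison with the regex version is implied.

```python
def split_top_level(text, sep=":", opener="<", closer=">"):
    """Split `text` on `sep`, ignoring separators inside nested brackets.

    Hypothetical helper; `sep`, `opener` and `closer` must each be a
    single character.
    """
    parts = []
    current = []
    depth = 0  # current bracket nesting level
    for ch in text:
        if ch == opener:
            depth += 1
        elif ch == closer and depth > 0:
            depth -= 1
        if ch == sep and depth == 0:
            # Top-level separator: close the current piece.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


print(split_top_level("a : <<a : b> c> : <a < a < b : b> : b> : b> : a"))
print(split_top_level("a< b >"))
```

Since this version splits only on the separator itself, an input with no ":" such as `a< b >` comes back as a single piece.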