I have the following expression:

a = 'x11 + x111 + x1111 + x1'

and I would like to replace the following:

from_ = ['1', '11', '111', '1111']
to = ['2', '22', '333', '3333']

and therefore obtain the following result:

anew = 'x22 + x333 + x3333 + x2'

How can I do this using Python?

This is similar to the question "Python replace multiple strings". However, in my case the replaced values are overwritten by later replacements if I use the answers suggested there. With those answers, the result is 'x22 + x222 + x2222 + x2'.

blaz

1 Answer

re.sub from the re module (regular expressions) can be used when you need to make several different replacements in a single pass.

re.sub accepts a function as its repl argument; in that function you can decide what each match is replaced with. From the documentation:

re.sub(pattern, repl, string, count=0, flags=0)

If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.

(emphasis mine)

The regex here is simple: \d+, which matches each whole run of consecutive digits. Every run is matched and replaced exactly once, so a replacement is never scanned again and overwritten.
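
To see what the pattern picks up, here is a quick check with re.findall (a sketch for illustration, not part of the original answer; the name `runs` is made up):

```python
import re

a = 'x11 + x111 + x1111 + x1'

# \d+ is greedy, so each match is a whole run of consecutive digits.
runs = re.findall(r'\d+', a)
print(runs)  # ['11', '111', '1111', '1']
```

Each of these strings is what a replacement function would receive via `group(0)`.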

You can use the following code snippet to get your desired output:

import re

a = 'x11 + x111 + x1111 + x1'

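def substitute(matched_obj):
    # Look up the matched run of digits and return its replacement, if any.
    from_ = ['1', '11', '111', '1111']
    to = ['2', '22', '333', '3333']
    part = matched_obj.group(0)
    if part in from_:
        return to[from_.index(part)]
    # Digit runs that are not in from_ are left unchanged.
    return part

# Replace every run of digits in a single pass.
anew = re.sub(r'\d+', substitute, a)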

After executing the program, the value of anew will be 'x22 + x333 + x3333 + x2', which is the expected result.
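
As a possible variation (a sketch, assuming the two lists always have the same length), the lists can be zipped into a dict so that the lookup becomes a single `dict.get` call. The name `mapping` is illustrative and not part of the original answer:

```python
import re

a = 'x11 + x111 + x1111 + x1'
from_ = ['1', '11', '111', '1111']
to = ['2', '22', '333', '3333']

# Illustrative name: pair each source value with its replacement.
mapping = dict(zip(from_, to))

# Each run of digits is looked up once; runs without an entry are kept as-is.
anew = re.sub(r'\d+', lambda m: mapping.get(m.group(0), m.group(0)), a)
print(anew)  # x22 + x333 + x3333 + x2
```

The dict lookup avoids scanning the list twice (`in` followed by `index`), which may matter if the replacement table is large.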

Bhargav Rao
  • No problem. There is a failure mode you haven't taken into account that I want to mention, consider the input `a = 'x12'`, the output should be `'x22'` but because of greedy regex you get output `'x12'` – wim Apr 12 '16 at 21:36
  • @wim Thanks for mentioning that. I'll try to get that correct as soon as possible. (Within 24 hrs; can't do it at the moment as I'm already in bed, 3:15 AM.) Thanks again. – Bhargav Rao Apr 12 '16 at 21:41
  • @wim I re-read the question. I think that the OP has a string with `+` separated values and each of the value there is a `x` followed by any of the values present in his `from_` list. So the change will be made only if there is `12` in the `from_` list. Hence, I think greedy here is quite good enough. Please do inform me if there is a change required, I'll try to modify the answer. – Bhargav Rao Apr 13 '16 at 16:49
  • OK I read the problem differently, and my feeling is that with input `'abc'` and `from, to = ['ab', 'bc'], ['xy', 'yz']` the correct output should be `'xyc'`. I don't know if this is right or not because the OP example doesn't give enough information here. – wim Apr 13 '16 at 18:42
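
For the other reading discussed in these comments, where the `from_` values may appear anywhere in the string rather than as whole runs of digits, one possible sketch is to build an alternation from the keys, longest first, and still replace in a single pass. The helper name `replace_all` is hypothetical, and whether this is the behaviour the question wants is not settled by the example given:

```python
import re

def replace_all(text, from_, to):
    """Hypothetical helper: replace every from_[i] with to[i] in one pass."""
    mapping = dict(zip(from_, to))
    # Longest keys first so that '1111' is tried before '111', '11' and '1'.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group(0)], text)

from_ = ['1', '11', '111', '1111']
to = ['2', '22', '333', '3333']

print(replace_all('x11 + x111 + x1111 + x1', from_, to))  # x22 + x333 + x3333 + x2
print(replace_all('x12', from_, to))                      # x22
print(replace_all('abc', ['ab', 'bc'], ['xy', 'yz']))     # xyc
```

Under this reading `'x12'` becomes `'x22'` and `'abc'` becomes `'xyc'`, while the original example still gives the expected result.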