0

There is an issue I have been thinking about for some time. I have a set of replacement rules for a string transformation job. I am learning regex and slowly finding the correct patterns, so that part is no problem. However, there are many rules and I could not combine them into a single expression, and now the replacements interfere with each other. Let me give a simple example.

Imagine I want to replace every 'a' with 'o' in a string. I also want to replace every 'o' with 'k' in the same string. The rules have no order, so if I apply the first rule first, the converted 'a's then become 'k', which is not my intention. All conversions must have the same priority or precedence. How can I overcome this issue?

I use re.sub(), but I think the same issue exists with the string.replace() method. All help is appreciated, thank you!

Rockybilly
    I believe this is what you are looking for: http://stackoverflow.com/questions/2400504/easiest-way-to-replace-a-string-using-a-dictionary-of-replacements – Gustavo Bezerra Mar 10 '16 at 01:49

4 Answers

1

Don't use str.replace(), use str.translate().

Here is how to do it with Python 2:

from string import maketrans

s = 'aoaoaoaoa'
trans_table = maketrans('ao', 'ok')
print s.translate(trans_table)

Output

okokokoko

It's a little different for Python 3:

s = 'aoaoaoaoa'
trans_table = {ord(k):v for k,v in zip('ao', 'ok')}
print(s.translate(trans_table))
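
As a side note, Python 3 also offers str.maketrans(), which builds the same kind of table without the dict comprehension. A minimal sketch, using the same sample string as above (the variable name result is only for illustration):

```python
s = 'aoaoaoaoa'

# str.maketrans maps each character of the first argument
# to the character at the same position in the second one.
trans_table = str.maketrans('ao', 'ok')

result = s.translate(trans_table)
print(result)
```

Like the dict version, this only handles single-character replacements.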
mhawke
  • Thank you, but I need the regex because the rules I am dealing with are quite complex. The example was just to express the problem. – Rockybilly Mar 10 '16 at 01:53
  • OK, could you post a representative example? Are you just replacing characters, or substrings? How would you decide which replacement to apply if more than one pattern matches? – mhawke Mar 10 '16 at 02:15
  • More than one pattern never matches in my case. As an example, say I have different replacement rules for an 'a' at the beginning of a word, at the end of a word, or standing alone. The same goes for some of the other letters. That's why I need regex rather than string.replace – Rockybilly Mar 10 '16 at 02:18
0

I had a similar challenge and ended up replacing the first character with a placeholder. I then replaced the second character. The third pass replaced the placeholder with the desired character. Not fancy, but it worked every time.

Replace the 'a' with '$', replace the 'o' with 'k', replace the '$' with 'o'. The placeholder must be a character that never occurs in the input.
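
The three passes could be sketched roughly like this. The function name swap_with_placeholder and the explicit check are my own additions for illustration, and the whole thing assumes '$' never appears in the text being processed:

```python
def swap_with_placeholder(text, placeholder="$"):
    # Assumption: the placeholder does not occur in the input;
    # refuse to continue if it does, since the result would be wrong.
    if placeholder in text:
        raise ValueError("placeholder already occurs in the input")
    text = text.replace("a", placeholder)   # pass 1: park every original 'a'
    text = text.replace("o", "k")           # pass 2: original 'o' -> 'k'
    return text.replace(placeholder, "o")   # pass 3: parked 'a' -> 'o'

print(swap_with_placeholder("aoaoaoaoa"))
```

Each extra rule that feeds into another one needs its own placeholder, so this gets unwieldy once there are many rules.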

0

We can solve it with the following code. The idea is to chain three replacements so that the original letters never collide:

a --> ao; o --> k (so each original a is now ak); ak --> o

string_test = "aaaoakkokkooao"
print(string_test.replace("a", "ao").replace("o", "k").replace("ak", "o"))
Yunhe
0

Try this (works in Python 2 and Python 3):

RULES = {'a': 'o', 'o': 'k'}  # a->o, o->k, ... no precedence
source = 'Hello I am ok'
dest = "".join(RULES.get(c, c) for c in source)
print(dest)

You can easily add rules.

It also works if there are "loops" (for example, adding 'k': 'a' would create the loop a -> o -> k -> a).
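
A small sketch of the loop case, wrapping the same lookup in a helper (the name apply_rules is only for illustration). Each character is looked up exactly once, so a replaced character is never replaced again:

```python
# Same rules as above plus 'k' -> 'a', which closes the loop a -> o -> k -> a.
RULES = {'a': 'o', 'o': 'k', 'k': 'a'}

def apply_rules(source):
    # Look up every character once; characters without a rule are kept.
    return "".join(RULES.get(c, c) for c in source)

print(apply_rules('Hello I am ok'))
```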

The big limitation is that it doesn't use regular expressions (as your question asks for). It could, if your regular expressions each matched exactly one character and were all mutually exclusive. But in that case you would not really need regular expressions, and the solution above would be enough. What do you do if two regular expressions match at the same place with different lengths? Which one do you use, since you do not want any priorities?
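
If the rules really do need regular expressions, one possible approach is to join them into a single alternation and let a callback choose the replacement, so the whole string is scanned only once and replaced text is never matched again. This is only a sketch: the rules below are made up for illustration, it assumes the individual patterns contain no capturing groups of their own (use (?:...) inside them), and when several patterns match at the same position the one listed first wins.

```python
import re

# Hypothetical rules as (pattern, replacement) pairs.
# Earlier entries take precedence when several match at the same position.
RULES = [
    (r"\ba\b", "A"),  # a standalone 'a'
    (r"\ba", "e"),    # 'a' at the beginning of a word
    (r"a\b", "u"),    # 'a' at the end of a word
    (r"a", "o"),      # any other 'a'
    (r"o", "k"),      # any 'o'
]

# Wrap each pattern in its own capturing group so the callback
# can tell which rule produced the match.
COMBINED = re.compile("|".join("(%s)" % pattern for pattern, _ in RULES))

def replace_all(text):
    def pick(match):
        # lastindex is the 1-based number of the group that matched,
        # which lines up with the rule's position in RULES.
        return RULES[match.lastindex - 1][1]
    return COMBINED.sub(pick, text)

print(replace_all("a alpha boa"))
```

Because re.sub() continues scanning after the end of each match, an 'o' produced from an 'a' is not turned into 'k' afterwards.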

Sci Prog