Most Efficient Way to Replace Multiple Characters in a String

Question

Let's say there is a string of any length, and it only contains the letters A through D:

s1 = 'ACDCADBCDBABDCBDAACDCADCDAB'

What is the most efficient/fastest way to replace every 'B' with a 'C' and every 'C' with a 'B'?

Here's what I am doing now:

replacedString = ''
for i in s1:
    if i == 'B':
        replacedString += 'C'
    elif i == 'C':
        replacedString += 'B'
    else:
        replacedString += i

This works, but it is obviously not very elegant. The problem is that I am dealing with strings that can be millions of characters long, so I need a better solution.

I can't think of a way to do this with the .replace() method. This suggests that maybe a regular expression is the way to go. Is that applicable here as well? If so, what is a suitable regular expression? Is there an even faster way?

Thank you.

The 'duplicate' question seems to address removal of characters not replacement. — Malonge, Feb 27 '15 at 22:02
Yes, I was about to tell you about string translation before Cyber marked as duplicate, but basically, you don't want to use a dictionary because you will replace already replaced values. — Malik Brahimi, Feb 27 '15 at 22:07
If the post linked as duplicate doesn't help you, see this: http://www.tutorialspoint.com/python/string_translate.htm — Fred Larson, Feb 27 '15 at 22:11
"efficient" can mean different things to different people. Do you only want to iterate once? If you can afford to iterate multiple times use `str.replace` otherwise use translate. — notorious.no, Feb 27 '15 at 22:17
One iteration would be ideal, as the loop will run for up to millions of iterations. — Malonge, Feb 27 '15 at 22:18
Noooooooooooo! Do not use replace! I will post a response explaining this effect. — Malik Brahimi, Feb 27 '15 at 22:18
Don't know why I can't edit my comment, but @MalikBrahimi is right. Don't use replace — notorious.no, Feb 27 '15 at 22:26
Everyone, please see my response below as to why you shouldn't use replacement. Be sure to check out my concatenation method. — Malik Brahimi, Feb 27 '15 at 22:39
I find this answer complete and useful. See this: https://stackoverflow.com/questions/3411771/multiple-character-replace-with-python#27086669 — Ali Hesari, Aug 25 '17 at 06:23

Malik Brahimi · Answer 1 · 2015-02-27T22:36:43.820

I wanted to show you the effects of improper translation. Suppose we have a DNA sequence like the string below and want to transcribe it to an RNA string. One method uses incorrect replacement, whereas the other uses string concatenation.

string = 'GGGCCCGCGCCCGGG' # DNA string ready for transcription

Replacement

The problem with replacement is that letters which have already been replaced get replaced again in a later iteration. For example, once the loop below finishes, you end up with a string made of a single repeated letter rather than a complete inversion.

string = 'GGGCCCGCGCCCGGG'

coding = {'A': 'U', 'T': 'A',
          'G': 'C', 'C': 'G'}

for k, v in coding.items():
    string = string.replace(k, v)

print(string)  # a run of one repeated letter, not the complement
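If chained .replace() calls are still wanted, one common workaround is to route one of the letters through a temporary placeholder so that nothing is replaced twice. This is only a sketch for the B/C swap from the question: the name swap_bc and the choice of '\0' as placeholder are my own, and it assumes the placeholder never occurs in the input. It also walks the string three times, so it is not the single-pass solution asked for.

```python
def swap_bc(s, placeholder='\0'):
    # Assumption: the placeholder character never appears in s.
    if placeholder in s:
        raise ValueError('placeholder occurs in the input')
    # Park every B on the placeholder, move C to B, then placeholder to C.
    return s.replace('B', placeholder).replace('C', 'B').replace(placeholder, 'C')

print(swap_bc('ABCD'))  # ACBD
```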

Concatenation

Instead, build a new string by concatenation, looking each character up in the mapping. Because the original string is never modified, nothing gets replaced twice. You can of course use string translation, but I tend to prefer dictionaries because, by definition, they map values.

string = 'GGGCCCGCGCCCGGG'

coding = {'A': 'U', 'T': 'A',
          'G': 'C', 'C': 'G'}

answer = ''

for char in string:
    answer += coding[char]

print(answer)  # CCCGGGCGCGGGCCC
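The same dictionary can also drive str.translate, which avoids growing a string with += on every character. A minimal sketch, assuming Python 3 (where str.maketrans accepts a dict with single-character keys); the function name transcribe is just for illustration:

```python
# Assumes Python 3: str.maketrans accepts a dict of single-character keys.
coding = {'A': 'U', 'T': 'A',
          'G': 'C', 'C': 'G'}
table = str.maketrans(coding)

def transcribe(dna):
    # Each character is looked up once, so nothing is replaced twice.
    return dna.translate(table)

print(transcribe('GGGCCCGCGCCCGGG'))  # CCCGGGCGCGGGCCC
```

Characters that are not in the table are left unchanged, whereas coding[char] in the loop above would raise a KeyError for them.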

IMO toss this in a [gist](https://gist.github.com/) and put it as a comment. Agreed that this is Not An Answer — Adam Smith, Feb 27 '15 at 22:29

score 2 · Accepted Answer · answered Feb 27 '15 at 22:23

Apart from the str.translate method, you could simply build a translation dict and run it yourself.

s1 = 'ACDCADBCDBABDCBDAACDCADCDAB'

def str_translate_method(s1):
    try:
        translationdict = str.maketrans("BC","CB")
    except AttributeError: # python2
        import string
        translationdict = string.maketrans("BC","CB")
    result = s1.translate(translationdict)
    return result

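def dict_method(s1):
    # "from" is a reserved word in Python, so use other names
    src, dst = "BC", "CB"
    translationdict = dict(zip(src, dst))
    # join on the empty string so no separators are inserted
    result = ''.join([translationdict.get(c, c) for c in s1])
    return result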
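To see which approach is faster on your own data, something like the following can be used. It is a rough sketch, assuming Python 3; the helper names, the repeated test string and the repetition count are arbitrary choices of mine, and the timings will vary by machine and Python version.

```python
import timeit

# Hypothetical longer input: the question's string repeated many times.
s1 = 'ACDCADBCDBABDCBDAACDCADCDAB' * 1000

table = str.maketrans("BC", "CB")
mapping = dict(zip("BC", "CB"))

def with_translate(s):
    return s.translate(table)

def with_dict(s):
    return ''.join([mapping.get(c, c) for c in s])

# Both approaches must agree before the timings mean anything.
assert with_translate(s1) == with_dict(s1)

for fn in (with_translate, with_dict):
    seconds = timeit.timeit(lambda: fn(s1), number=20)
    print(fn.__name__, round(seconds, 4))
```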

This is nearly identical to my answer [here](http://stackoverflow.com/a/23332621/3058609). The question is possibly a dupe, but since this question has overlapping translations (`B <--> C`) it seems different enough to answer. — Adam Smith, Feb 27 '15 at 22:27

score 0 · Answer 3 · answered Feb 27 '15 at 23:00

Using a regular expression. This also handles case: if the character to be replaced is lowercase, it is replaced with the lowercase replacement character; otherwise the uppercase one is used:

import re

chars_map = {'b': 'c', 'c': 'b'} # build a dictionary of replacement characters in lowercase

def rep(match):
    char = match.group(0)
    replacement = chars_map[char.lower()]
    return replacement if char.islower() else replacement.upper()

s = 'AbC'
print(re.sub('(?i)%s' % '|'.join(chars_map.keys()), rep, s)) # 'AcB'
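For the uppercase-only string in the question, the case handling is not needed and the pattern can be a plain character class. A small sketch along the same lines; the names swap and swap_bc are mine:

```python
import re

swap = {'B': 'C', 'C': 'B'}
pattern = re.compile('[BC]')  # match only the characters that need changing

def swap_bc(s):
    # The callback looks each matched character up in the dict,
    # so the string is scanned once and nothing is replaced twice.
    return pattern.sub(lambda m: swap[m.group(0)], s)

print(swap_bc('ABCD'))  # ACBD
```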
