1

Let's say I have the following string: 'streets are shiny.' I wish to find every occurrence of the string 'st' and replace it with 'ts'. So the result should read 'tsreets are shiny.'.

I know this can be done using re.sub() or str.replace(). However, say I have the following strings:

  1. 'st'
  2. 'sts'
  3. 'stst'

I want them to change to 'ts','tss' and 'ttss' respectively, as I want all occurrences of 'st' to change to 'ts'.

What is the best way to replace these strings with optimal runtime? I know I could continually perform a check to see if "st" in string until this returns False, but is there a better way?
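To make the difficulty concrete, here is a minimal sketch (not part of the original question) showing that one pass of str.replace is not enough for the third example. The replacement itself creates a new 'st' in the middle of the string:

```python
# A single pass only replaces the non-overlapping matches found in the original string.
once = 'stst'.replace('st', 'ts')
print(once)          # tsts
print('st' in once)  # True: the two replacements created a new 'st' between them
```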

jonrsharpe
jonhurlock

5 Answers

4

I think that a while loop that just checks if the 'st' is in the string is best in this case:

def recursive_replace(s, sub, new):
    while sub in s:
        s = s.replace(sub, new)
    return s

tests = ['st', 'sts', 'stst']
print([recursive_replace(test, 'st', 'ts') for test in tests])
# OUT: ['ts', 'tss', 'ttss']
pzp
2

While the looping solutions are probably the simplest, you can actually write a re.sub call with a custom function to do all the transformations at once.

The key insight for this is that your rule (changing st to ts) will end up moving every s in a block of mixed s and t characters to the right of every t in that block. We can simply count the s and t characters and make an appropriate replacement:

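import re

def sub_func(match):
    # The matched block starts with 's', ends with 't', and contains only 's' and 't'.
    block = match.group(1)
    return "t"*block.count("t") + "s"*block.count("s")

# text is the input string
re.sub(r'(s[st]*t)', sub_func, text)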
Blckknght
  • If the OP intends to call it many times, it will make sense to use re.compile (https://docs.python.org/2/library/re.html#re.compile). That'll make it faster in the benchmark. @Blckknght: Perhaps you should add another solution to your answer using re.compile(). – pzp May 06 '15 at 00:14
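A minimal sketch of what the comment suggests, assuming the same pattern and replacement function as the answer above. The names ST_BLOCK and replace_all are made up for illustration. Note that the re module already caches recently used patterns, so any gain from compiling by hand may be small and is worth measuring:

```python
import re

# Compile the pattern once and reuse it for every call.
ST_BLOCK = re.compile(r's[st]*t')  # hypothetical name

def sub_func(match):
    # Move every 't' in the matched block ahead of every 's'.
    block = match.group(0)
    return 't' * block.count('t') + 's' * block.count('s')

def replace_all(text):  # hypothetical helper
    return ST_BLOCK.sub(sub_func, text)

print([replace_all(t) for t in ['st', 'sts', 'stst']])  # ['ts', 'tss', 'ttss']
```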
0

You can do that with a pretty simple while loop:

s="stst"
while('st' in s):
  s = s.replace("st", "ts")
print(s)

ttss

Ryan
0

If you want to continually check, then the other answers work well (with the caveat that if you have something like stt you would get stt->tst->tts). I don't know if you want that.
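The stt case can be traced pass by pass with a small generator. This is only a sketch, and replace_passes is a hypothetical helper name, not something from the other answers:

```python
def replace_passes(s, old='st', new='ts'):
    """Yield the string after each full str.replace pass until `old` is gone."""
    while old in s:
        s = s.replace(old, new)
        yield s

print(list(replace_passes('stt')))  # ['tst', 'tts']
```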

I think, however, that you are trying to replace multiple occurrences of st with ts. If that is the case, you should definitely use str.replace (string.replace in Python 2's string module). It replaces every non-overlapping occurrence of the substring, or only the first count occurrences if you pass the optional third argument.

This should be faster according to this.

str.replace(old, new[, count])

example:

>>> st = 'streets are shiny.streets are shiny.streets are shiny.'
>>> st.replace('st', 'ts')
'tsreets are shiny.tsreets are shiny.tsreets are shiny.'
IronManMark20
0

Naively you could do:

>>> ['t'*s.count('t')+'s'*s.count('s') for s in ['st', 'sts', 'stst']]
['ts', 'tss', 'ttss']
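One caveat, sketched below with a hypothetical helper name (naive): counting over the whole string only works when the input consists solely of s and t characters, because every other character is dropped.

```python
def naive(s):
    # Same expression as above, wrapped in a function for reuse.
    return 't' * s.count('t') + 's' * s.count('s')

print(naive('stst'))     # ttss
print(naive('streets'))  # ttss (the 'r' and 'e' characters are lost)
```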
dawg