How to delete all the consecutive characters in a string and print the remaining string?

Question

For the string 'mississipie', remove all consecutive repeating characters and print what remains, repeating the removal until no consecutive repeats are left.

Sample input: 'mississipie'
Sample output: 'mpie'

It sounds like a homework question. You need to create a function that removes duplicate characters ('ss') and then calls itself recursively to remove the resulting run ('iii'). Since other inputs may contain more duplicated characters, you also need to add an if condition that stops the recursion when the length of the string is 1 and returns the final string. – NotAName, Jul 23 '20 at 08:56
Did you try it yourself? [How do I ask and answer homework questions?](https://meta.stackoverflow.com/q/334822). We'll be happy to help you with your code if there is any. – Adrien Kaczmarek, Jul 23 '20 at 09:00
Does this answer your question? [How to remove duplicates only if consecutive in a string?](https://stackoverflow.com/questions/11460855/how-to-remove-duplicates-only-if-consecutive-in-a-string) — metatoaster, Jul 23 '20 at 09:00
I tried counting all the elements and deleting them but didn't get the expected output, so I kept modifying the code and now it's completely messed up. – Purushothaman U, Jul 23 '20 at 09:02

score 4 · Accepted Answer · answered Jul 23 '20 at 09:02

A recursive version with itertools.groupby:

from itertools import groupby

s = 'mississipie'

def remove(s):
    out = ''
    for _, g in groupby(s):
        tmp = ''.join(g)
        if len(tmp) == 1:
            out += tmp
    if out == s:
        return out
    return remove(out)

print(remove(s))

Prints:

mpie

answered Jul 23 '20 at 09:02

Andrej Kesely

Can you explain `_,` in the for loop? What is it used for? – Purushothaman U Jul 23 '20 at 09:08
If you are not interested in a loop variable, you may name it `_`. It's a common Python convention. Here, groupby yields (key, group) tuples. You are not interested in the first element (the key, i.e. the character shared by the run) but only in the second element, which is the run of consecutive equal characters itself. Since the first element is irrelevant, it was named `_`. – Martin Wettstein Jul 23 '20 at 09:18
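
To make the `_` explanation concrete, here is a small sketch of what `groupby` yields for the sample string. The names `runs` and `singles` are illustrative only and do not appear in the answer above.

```python
from itertools import groupby

s = 'mississipie'

# Each item yielded by groupby is a (key, group) pair: the key is the
# repeated character, the group is an iterator over that run.
runs = [(key, ''.join(group)) for key, group in groupby(s)]
print(runs)

# When only the run matters, the key can be bound to `_` and ignored.
# Keeping the runs of length 1 corresponds to a single pass of remove().
singles = ''.join(run for _, run in runs if len(run) == 1)
print(singles)
```

The second print shows the string after one pass only; the answer's function keeps calling itself until a pass changes nothing.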

Tim Biegeleisen · Answer 2 · 2020-07-23T09:05:56.243

3

Using re.sub in a while loop, we can repeatedly remove clusters of two or more repeating characters from the input. We keep iterating until a pass leaves the length of the string unchanged, which means no more replacements were made. This is how we know when to stop replacing.

import re

inp = "mississipie"
length = len(inp)
while True:
    inp = re.sub(r'(.)\1+', '', inp)
    if len(inp) == length:
        break
    length = len(inp)

print("final output: " + inp)

This prints:

final output: mpie

Here are the steps of replacement:

mississipie
miiipie      (remove 'ss', twice)
mpie         (remove 'iii' cluster, once)
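
The intermediate steps can be reproduced with a small variation of the loop that records the string after every pass. This is only a sketch; the helper name `collapse_passes` is hypothetical and not part of the answer.

```python
import re

def collapse_passes(text):
    """Return the string after each pass, starting with the input itself."""
    steps = [text]
    while True:
        # One pass: drop every run of two or more identical characters.
        reduced = re.sub(r'(.)\1+', '', text)
        if reduced == text:
            # Nothing was removed, so the string is stable.
            return steps
        steps.append(reduced)
        text = reduced

for step in collapse_passes("mississipie"):
    print(step)
```

Comparing the strings directly, instead of their lengths, is equivalent here, because a pass can only ever shorten the string.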

edited Jul 23 '20 at 09:05

answered Jul 23 '20 at 09:00

Tim Biegeleisen

Can you please explain the line `inp = re.sub(r'(.)\1+', '', inp)`? – Purushothaman U Jul 23 '20 at 09:22
The regex pattern `(.)\1+` matches any single character that is followed by that same character one or more times. The entire match is then replaced by the empty string, which effectively removes it from the input. – Tim Biegeleisen Jul 23 '20 at 09:24
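
As a quick illustration of that pattern, the sketch below lists what `(.)\1+` matches in the sample string during a single pass. The variable names are illustrative. Note that `.` does not match a newline unless `re.DOTALL` is set.

```python
import re

pattern = re.compile(r'(.)\1+')

# Each match is a whole run of two or more identical characters.
runs = [m.group(0) for m in pattern.finditer("mississipie")]
print(runs)

# A lone character is never matched, so it survives the substitution.
after_one_pass = pattern.sub('', "mississipie")
print(after_one_pass)
```

After this first pass the three 'i' characters have become adjacent, which is why the substitution has to be repeated.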
