How can I remove all repeated characters from a string?
e.g.:
Input: string = 'Hello'
Output: 'Heo'
This is a different question from "Removing duplicate characters from a string", as I don't want to print out the duplicates; I want to delete them.
You can use a generator expression and join, like:
>>> x = 'Hello'
>>> ''.join(c for c in x if x.count(c) == 1)
'Heo'
You could construct a Counter from the string and keep only the characters that the counter shows appear exactly once:
from collections import Counter
string = 'Hello'
c = Counter(string)
''.join([i for i in string if c[i]==1])
# 'Heo'
a = 'Hello'
list_a = list(a)
output = []
for i in list_a:
    # keep only characters that occur exactly once
    if list_a.count(i) == 1:
        output.append(i)
''.join(output)
# 'Heo'
In addition to the other answers, a filter is also possible:
s = 'Hello'
result = ''.join(filter(lambda c: s.count(c) == 1, s))
# result - Heo
If you limit your question to cases with only repeated consecutive letters (as your example suggests), you could employ regular expressions:
import re
print(re.sub(r"(.)\1+", "", "hello")) # result = heo
print(re.sub(r"(.)\1+", "", "helloo")) # result = he
print(re.sub(r"(.)\1+", "", "hellooo")) # result = he
print(re.sub(r"(.)\1+", "", "sports")) # result = sports
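The difference between removing consecutive repeats and removing characters repeated anywhere in the string can be seen in a minimal sketch like the one below. The function names are hypothetical and chosen only for illustration; the second function uses a count-based approach rather than a regular expression.

```python
import re
from collections import Counter


def drop_consecutive_repeated(s):
    # Remove only runs of two or more identical adjacent characters
    return re.sub(r"(.)\1+", "", s)


def drop_all_repeated(s):
    # Remove every character that occurs more than once anywhere in s
    counts = Counter(s)
    return "".join(ch for ch in s if counts[ch] == 1)


print(drop_consecutive_repeated("abcab"))  # abcab (no adjacent repeats)
print(drop_all_repeated("abcab"))          # c
print(drop_consecutive_repeated("Hello"))  # Heo
print(drop_all_repeated("Hello"))          # Heo
```

For 'Hello' both give the same result, but as soon as the repeated characters are not adjacent, only the count-based version removes them.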
If you need to re-apply the regular expression many times, it's worth compiling it beforehand:
prog = re.compile(r"(.)\1+")
print(prog.sub("", "hello"))
To restrict the search for duplicated letters to some subset of characters, you can adjust the regular expression accordingly:
print(re.sub(r"(\S)\1+", "", "hello")) # Search for duplicated non-whitespace chars, result = heo
print(re.sub(r"([a-z])\1+", "", "hello")) # Search for duplicated lowercase letters, result = heo
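On a plain word like "hello" the three patterns behave identically, so here is a small sketch with an input where they differ. The sample string is made up for illustration and contains doubled letters, doubled spaces and doubled digits.

```python
import re

sample = "aa  11 bb"  # hypothetical input: doubled letters, spaces and digits

any_char = re.sub(r"(.)\1+", "", sample)       # every doubled run is removed
non_space = re.sub(r"(\S)\1+", "", sample)     # doubled spaces are kept
lowercase = re.sub(r"([a-z])\1+", "", sample)  # doubled spaces and digits are kept

print(repr(any_char))   # ' '
print(repr(non_space))  # '   '
print(repr(lowercase))  # '  11 '
```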
Alternatively, an approach using itertools.groupby in a list comprehension could look as follows:
from itertools import groupby
def dedup(s):
    # keep a character only if its run of consecutive repeats has length 1
    return "".join([i for i, g in groupby(s) if len(list(g)) == 1])
print(dedup("hello")) # result = heo
print(dedup("helloo")) # result = he
print(dedup("hellooo")) # result = he
print(dedup("sports")) # result = sports
Note that on my machine the method using regular expressions was about 8-10 times faster than the one using groupby (system: Python 3.6.7, MacBook Pro (Mid 2015)).
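If you want to check the timing on your own system, a rough sketch with timeit could look like this. The function names, the sample string and the repetition count are arbitrary assumptions, and the ratio you measure will depend on your Python version, hardware and input.

```python
import re
import timeit
from itertools import groupby

PATTERN = re.compile(r"(.)\1+")


def dedup_regex(s):
    # Regular-expression variant: drop runs of repeated adjacent characters
    return PATTERN.sub("", s)


def dedup_groupby(s):
    # groupby variant: keep characters whose run has length 1
    return "".join([ch for ch, run in groupby(s) if len(list(run)) == 1])


sample = "hellooo sports " * 1000  # hypothetical test input

t_regex = timeit.timeit(lambda: dedup_regex(sample), number=100)
t_groupby = timeit.timeit(lambda: dedup_groupby(sample), number=100)
print(f"regex:   {t_regex:.4f} s")
print(f"groupby: {t_groupby:.4f} s")
```

One caveat: by default `.` does not match a newline, so the two variants can differ on input containing repeated newlines unless the pattern is compiled with re.DOTALL.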