How to remove characters that appear more than once from a string?

Question

So, I had a similar exercise in my IT class: 'Print a string without the characters that appear more than once (if a character appears more than once, remove it entirely)'. I thought it was easy (and maybe it is), but I have absolutely no idea how to do it. I can do similar exercises (print all unique characters from a string, remove duplicates, etc.).

Example:

Input: '12345555555678'

Output: '1234678'

Does [this](https://stackoverflow.com/a/9841328/11261546) answer your question? — Ivan, Mar 04 '20 at 10:15
If you can already remove duplicates (to print unique characters), then start with that code, show how you made some effort to adapt it to the new problem, and tell us what actual issues you encountered. — Useless, Mar 04 '20 at 12:22
@ivan with [that](https://stackoverflow.com/a/9841328/11261546) the result would be 1234**5**678 not 1234678 — Sorin, Mar 04 '20 at 12:31

Sorin · Answer 1 · 2020-03-04T12:20:44.517

The basic algorithm for this is described in this answer: for each char, you check whether it appears more than once by counting its occurrences in the string.

However, that's fairly inefficient, since it takes on the order of n^2 character comparisons for a string of length n. You can improve on that at the expense of some memory (which is illustrated in this answer, but obfuscated by a library).

The algorithm would then be to go through the string once, counting the number of occurrences of each char and saving them somewhere, then go through the string again and print only the chars that have a count of 1.

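inp = '1345552225555678'

# First pass: count how many times each character occurs.
counts = {}

for ch in inp:
    if ch in counts:
        counts[ch] = counts[ch] + 1
    else:
        counts[ch] = 1

# Second pass: keep only the characters that occur exactly once.
result = ''

for ch in inp:
    if counts[ch] == 1:
        result = result + ch

print(result)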

Arguably, this would be O(n), since the access time for a dictionary is generally considered O(1) (see this question for a discussion).

Note: Usually this is done using an array whose size is the number of legal chars, but since strings in Python are Unicode, such an array would be huge; however, the access time would be truly O(1).
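To illustrate the array variant mentioned in the note, here is a minimal sketch that assumes the input contains only ASCII characters, so a list of 128 counters is enough. The function name `remove_repeated_ascii` and the 128-slot limit are assumptions made for this example, not part of the original answer.

```python
def remove_repeated_ascii(text):
    """Keep only the characters that occur exactly once in an ASCII string."""
    # Assumption: every character is ASCII, so 128 counters cover the alphabet.
    counts = [0] * 128
    for ch in text:
        counts[ord(ch)] += 1  # raises IndexError for non-ASCII characters
    # Second pass: keep characters whose counter is exactly 1.
    return ''.join(ch for ch in text if counts[ord(ch)] == 1)


print(remove_repeated_ascii('1345552225555678'))  # 134678
```

For arbitrary Unicode input the dictionary version is the more practical choice, since the list would need one slot per possible code point.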

score 1 · Answer 2 · edited Mar 04 '20 at 12:05

This should look like what you want

input_str = 'ahuadvzudnioqdazvyduazdazdui'
for c in input_str:
    if input_str.count(c)==1:
        print(c)

It's easier to understand, but note that it performs quite poorly (complexity of O(n^2)).

To make it a little faster you can use a list comprehension (the complexity is still O(n^2)).

input_str = '12345555555678'
[x for x in input_str if input_str.count(x) == 1]

If the order of the elements doesn't matter to you, then iterating over a set of the characters will be beneficial.

If you convert the string into a set using set(input_str), it will contain only the unique values, which reduces the search space.

Then you can apply a list comprehension.

input_str = '12345555555678'
[x for x in set(input_str) if input_str.count(x) == 1]

Note: Do not forget that the order will not be preserved after converting to a set.
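If the order does matter, one possible variation (not part of the original answer) is to deduplicate with `dict.fromkeys` instead of `set`, since dictionaries keep insertion order in Python 3.7+. The sketch below uses a hypothetical helper name, `unique_chars_in_order`, and joins the result back into a string.

```python
def unique_chars_in_order(text):
    """Return the characters that occur exactly once, in their original order."""
    # dict.fromkeys keeps one key per distinct character, in first-seen order.
    distinct = dict.fromkeys(text)
    # count() is now called once per distinct character, not once per position.
    return ''.join(ch for ch in distinct if text.count(ch) == 1)


print(unique_chars_in_order('12345555555678'))  # 1234678
```

Each call to `count` still scans the whole string, so this only reduces the number of scans; it does not change the worst-case complexity.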

Tried it too, but it also prints a letter that appears more than once instead of deleting it. It's a similar answer, but not the same. With your code and input 'zzzzx1' I get output 'zx1', not 'x1' — neoxxx, Mar 04 '20 at 10:23

Filip Młynarski · Accepted Answer · 2020-03-04T11:55:57.553

1

You could use collections.Counter().

from collections import Counter

inp = '12345555555678'
c = Counter(inp)
output = ''.join(k for k, v in c.items() if v == 1)  # -> 1234678

Simple implementation of Counter

c = {}
for char in inp:
    c[char] = c.get(char, 0) + 1
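The hand-written counter above only builds the counts. As a minimal sketch of the full library-free version, it can be combined with the same join expression used with `Counter`. The function name `remove_repeated` is an assumption for illustration, and the output order relies on dictionaries preserving insertion order (Python 3.7+).

```python
def remove_repeated(inp):
    """Return the characters of inp that occur exactly once."""
    # Build the counts by hand, as a plain-dict stand-in for Counter.
    c = {}
    for char in inp:
        c[char] = c.get(char, 0) + 1
    # Keys come back in first-seen order, so the result follows the input order.
    return ''.join(k for k, v in c.items() if v == 1)


print(remove_repeated('12345555555678'))  # 1234678
```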

edited Mar 04 '20 at 11:55

answered Mar 04 '20 at 10:21


I don't understand this code, but it works :D Thanks – neoxxx Mar 04 '20 at 10:27
Check out the documentation I linked in my answer. It's well written, I believe you'll understand it :) – Filip Młynarski Mar 04 '20 at 10:37
Since this is an IT class exercise, you probably should not use libraries, especially since this is trivial to implement without them. – Sorin Mar 04 '20 at 11:25
How would you do that without a library? Also, libraries are allowed in my classes, but I would like to know. – neoxxx Mar 04 '20 at 11:32
@Sorin that was not specified in question but I updated my answer anyway. – Filip Młynarski Mar 04 '20 at 11:56
@neoxxx - allowed or not, libraries should be used sparingly when you are trying to learn basic stuff like this. – Sorin Mar 04 '20 at 12:10

Akhil Sharma · Answer 4 · 2020-03-04T14:55:51.287

i_str =  '12345555555678'
b = sorted(i_str)
for i in range(len(b)-1):
    if b[i] == b[i+1]:
        i_str = i_str.replace(b[i],'')

You just sort the string and compare each element with the next one. If they are not the same, the element is unique; if they are the same, that character is removed from the original string.

Also, I am pretty sure it should be faster than using the count function, which iterates through the whole string for each unique element to check whether the character's count is greater than 1.
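A variation on the sorting idea, offered as a sketch and not as the answer's own method: sort once, use `itertools.groupby` to find every character that forms a run longer than one, then filter the original string in a single pass. The function name `remove_repeated_sorted` is hypothetical.

```python
from itertools import groupby


def remove_repeated_sorted(text):
    """Return the characters of text that occur exactly once, in original order."""
    # Sorting puts equal characters next to each other, so groupby yields
    # exactly one group per distinct character.
    repeated = {
        ch for ch, group in groupby(sorted(text))
        if sum(1 for _ in group) > 1
    }
    # One pass over the original string keeps the original order.
    return ''.join(ch for ch in text if ch not in repeated)


print(remove_repeated_sorted('12345555555678'))  # 1234678
```

This avoids calling `replace` once per adjacent duplicate pair; the cost is dominated by the sort, which is O(n log n).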

Aldas Žarnauskas · Answer 5 · 2022-07-11T11:48:12.283

0

I solved a similar task on Codecademy. I was asked to define a function that removes all vowels, even if they repeat. My code, which also removes repeated symbols, is below:

def anti_vowel(text):
    all_vowels = ["A", "E", "U", "I", "O", "a", "e", "o", "u", "i"]
    listed_text = []
    for letter in text:
        listed_text.append(letter)
    for vowel in all_vowels:
        while vowel in listed_text:
            listed_text.remove(vowel)
    return "".join(listed_text)
    
print(anti_vowel("Hey look Words!"))

output:

Hy lk Wrds!

edited Jul 11 '22 at 11:48

answered Jul 11 '22 at 11:38


This answer does not address the question. – rachwa Jul 15 '22 at 09:36
