Remove non-repeating characters from a list

Question

I am trying to remove non-repeating characters from a list in Python, e.g. list = [1,1,2,3,3,3,5,6] should return [1,1,3,3,3]. My initial attempt was:

def tester(data):
    for x in data:
        if data.count(x) == 1:
            data.remove(x)
    return data

This will work for some inputs, but for [1,2,3,4,5], for example, it returns [2,4]. Could someone please explain why this occurs?

also, your approach is O(n^2) which is probably not what you want — ThiefMaster, Jun 17 '14 at 00:12
You should not change a list while you do loop through its elements. That is why you got [2,4] — VladimirM, Jun 17 '14 at 00:18
possible duplicate of [Remove items from a list while iterating in Python](http://stackoverflow.com/questions/1207406/remove-items-from-a-list-while-iterating-in-python) — Dan Lenski, Jun 17 '14 at 00:19

Padraic Cunningham · Accepted Answer · 2014-06-17T00:27:27.283

>>> l = [1, 1, 2, 3, 3, 3, 5, 6]
>>> [x for x in l if l.count(x) > 1]
[1, 1, 3, 3, 3]

This keeps only the elements that appear at least twice in your list.

In your own code, you need to change the line for x in data: to for x in data[:]:

Using data[:], you are iterating over a copy of the original list, so removing items from data no longer disturbs the loop.

edited Jun 17 '14 at 00:27

answered Jun 17 '14 at 00:17


I wasn't the downvoter, but perhaps it was for suggesting a quadratic time solution in place of a linear time one. If the list is always very short, the quadratic performance won't really matter, but in the general case it's very important – John La Rooy Jun 17 '14 at 01:52
@gnibbler, the question was not what the most efficient way to remove non-repeating elements from a list is. It was about why the OP's code was not working. I provided two ways to get around it, so I think downvoting is still bs. – Padraic Cunningham Jun 17 '14 at 08:42

score 4 · Answer 2 · answered Jun 17 '14 at 00:17

There is a linear time solution for that:

def tester(data):
    cnt = {}
    for e in data:
        cnt[e] = cnt.get(e, 0) + 1
    return [x for x in data if cnt[x] > 1]
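
To get a rough feel for the difference between the two approaches, here is a small timing sketch using the standard timeit module. The function names quadratic and linear and the sample input are made up for illustration, and the timings will vary by machine.

```python
import timeit


def quadratic(data):
    # list.count scans the whole list for every element: O(n^2) overall.
    return [x for x in data if data.count(x) > 1]


def linear(data):
    # One pass to count, one pass to filter: O(n) overall.
    cnt = {}
    for e in data:
        cnt[e] = cnt.get(e, 0) + 1
    return [x for x in data if cnt[x] > 1]


# Hypothetical input: values 0..499 appear twice, 500..1499 appear once.
sample = list(range(1500)) + list(range(500))

# Both approaches must agree on the result before timing them.
assert quadratic(sample) == linear(sample)

print("quadratic:", timeit.timeit(lambda: quadratic(sample), number=3))
print("linear:   ", timeit.timeit(lambda: linear(sample), number=3))
```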

answered Jun 17 '14 at 00:17

pkacprzak


score 3 · Answer 3 · answered Jun 17 '14 at 00:21

This is occurring because you are removing from a list as you're iterating through it. Instead, consider appending to a new list.
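
A minimal sketch of the new-list approach, assuming the input should be left untouched. The function name keep_repeated is hypothetical.

```python
def keep_repeated(data):
    result = []
    for x in data:
        # `data` is never modified, so the loop visits every element.
        if data.count(x) > 1:
            result.append(x)
    return result


original = [1, 2, 3, 4, 5]
print(keep_repeated(original))                  # []
print(original)                                 # [1, 2, 3, 4, 5]
print(keep_repeated([1, 1, 2, 3, 3, 3, 5, 6]))  # [1, 1, 3, 3, 3]
```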

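You could also use collections.Counter, if you're using 2.7 or greater. Note that this yields each repeated value once (1 and 3 for the example list) rather than every occurrence:

import collections
[a for a, b in collections.Counter(your_list).items() if b > 1]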
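
To keep every occurrence of the repeated values, as the question asks, one option is to filter the original list against the counts. A minimal sketch; the function name keep_repeated_counter is made up for illustration.

```python
from collections import Counter


def keep_repeated_counter(data):
    # Counter builds the value -> occurrence count mapping in one pass.
    counts = Counter(data)
    # Filter the original list so order and duplicates are preserved.
    return [x for x in data if counts[x] > 1]


print(keep_repeated_counter([1, 1, 2, 3, 3, 3, 5, 6]))  # [1, 1, 3, 3, 3]
print(keep_repeated_counter([1, 2, 3, 4, 5]))           # []
```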

score 1 · Answer 4 · answered Jun 17 '14 at 01:50

Another linear solution.

>>> data = [1, 1, 2, 3, 3, 3, 5, 6]
>>> D = dict.fromkeys(data, 0)
>>> for item in data:
...     D[item] += 1
... 
>>> [item for item in data if D[item] > 1]
[1, 1, 3, 3, 3]

answered Jun 17 '14 at 01:50

John La Rooy


score 0 · Answer 5 · answered Jun 17 '14 at 00:19

You shouldn't remove items from a list while iterating over that same list. The list iterator tracks its position by index, so each removal shifts the remaining items one slot to the left, and the loop then skips the item that moved into the vacated slot.
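
The effect can be seen by tracing the loop from the question on [1, 2, 3, 4, 5]. This is only an illustrative sketch; trace_removal is a made-up helper that records which elements the loop actually visits.

```python
def trace_removal(data):
    visited = []
    for i, x in enumerate(data):
        visited.append(x)
        print(f"index {i}: visiting {x}, list is {data}")
        if data.count(x) == 1:
            # Everything after x shifts one slot to the left, so the
            # next index now points one element further than intended.
            data.remove(x)
    return visited


data = [1, 2, 3, 4, 5]
visited = trace_removal(data)
print(visited)  # [1, 3, 5]
print(data)     # [2, 4]
```

The loop only ever sees 1, 3 and 5, which is why 2 and 4 survive even though they are not repeated.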

See this question for another example of the same problem, with many suggested alternative approaches.

score 0 · Answer 6 · answered Jun 17 '14 at 00:28

You can use a list comprehension, like this:

def tester(data):
    return [x for x in data if data.count(x) != 1]

Removing items from a list while iterating over it is not recommended.

answered Jun 17 '14 at 00:28

tiann
