Finding non-unique elements in a list not working

Question

I want to find the non-unique elements in a list, but I cannot figure out why the code below does not do that.

>>> d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6,'f',3]
>>> for i in d:
...     if d.count(i) == 1:
...             d.remove(i)
... 
>>> d
[1, 2, 1, 2, 4, 4, 'a', 'b', 'a', 'b', 6, 3]

6 and 3 should have been removed. However, if I use

d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c']

I get the correct answer. Please explain what is happening; I am confused!

I am using Python 2.7.5.

score 26 · Accepted Answer · edited Sep 11 '17 at 13:58

Removing elements of a list while iterating over it is never a good idea. The appropriate way to do this would be to use a collections.Counter with a list comprehension:

>>> from collections import Counter
>>> d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6, 'f', 3]
>>> # Use items() instead of iteritems() in Python 3
>>> [k for (k,v) in Counter(d).iteritems() if v > 1]
['a', 1, 2, 'b', 4]

If you want to keep the duplicate elements in the order in which they appear in your list:

>>> keep = {k for (k,v) in Counter(d).iteritems() if v > 1}
>>> [x for x in d if x in keep]
[1, 2, 1, 2, 4, 4, 'a', 'b', 'a', 'b']
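For readers on Python 3, the order-preserving variant above can be wrapped in a small function. This is a sketch rather than part of the original answer: the name `non_unique` is hypothetical, and it assumes every element is hashable, since `Counter` relies on hashing.

```python
from collections import Counter


def non_unique(items):
    """Return the elements of items that occur more than once, in original order."""
    counts = Counter(items)  # one pass to count every element
    # Keep each element, repetitions included, whose total count exceeds 1.
    return [x for x in items if counts[x] > 1]


d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6, 'f', 3]
print(non_unique(d))  # [1, 2, 1, 2, 4, 4, 'a', 'b', 'a', 'b']
```

Looking counts up directly in the `Counter` avoids building the intermediate `keep` set, and the input list is never modified.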

I'll try to explain why your approach doesn't work. To understand why some elements aren't removed as they should be, imagine that we want to remove every b from the list [a, b, b, c] while looping over it. It'll look something like this:

+-----------------------+
|  a  |  b  |  b  |  c  |
+-----------------------+
   ^ (first iteration)

+-----------------------+
|  a  |  b  |  b  |  c  |
+-----------------------+
         ^ (next iteration: we found a 'b' -- remove it)

+-----------------------+
|  a  |     |  b  |  c  |
+-----------------------+
         ^ (removed b)

+-----------------+
|  a  |  b  |  c  |
+-----------------+
         ^ (shift subsequent elements down to fill vacancy)

+-----------------+
|  a  |  b  |  c  |
+-----------------+
               ^ (next iteration)

Notice that we skipped the second b! Once we removed the first b, elements were shifted down and our for-loop consequently failed to touch every element of the list. The same thing happens in your code.
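The skipping can be observed directly by recording which positions the loop visits. The sketch below is illustrative only: the helper name `trace_removal` is hypothetical, and it relies on `enumerate` stepping through the list by position in the same way a plain for-loop does.

```python
def trace_removal(items, target):
    """Remove target from a copy of items while looping, recording each visit."""
    items = list(items)  # copy so the caller's list is untouched
    visited = []
    for index, value in enumerate(items):
        visited.append((index, value))
        if value == target:
            items.remove(value)  # later elements shift down by one position
    return visited, items


visited, remaining = trace_removal(['a', 'b', 'b', 'c'], 'b')
print(visited)    # [(0, 'a'), (1, 'b'), (2, 'c')]
print(remaining)  # ['a', 'b', 'c']
```

By the time the loop reaches index 2, that slot holds 'c', so the second 'b' is never examined and survives in the list.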

edited Sep 11 '17 at 13:58

The Unfun Cat

answered Sep 25 '13 at 13:25

arshajii

Thanks. Using Counter works, but why doesn't my piece of code work? I want to know what the problem with it is. – Tanmaya Meher Sep 25 '13 at 13:28
By the way, I want the answer to be [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b'], not ['a',1,2,'b',4]. If I use the elements() method of Counter in your code to get this answer, the list comes out sorted, which I do not want. – Tanmaya Meher Sep 25 '13 at 13:31
@tanmay See the edit (by the way was 5 a typo? why should 5 be in that list?). – arshajii Sep 25 '13 at 13:39
Thanks for the explanation; at least now I know what is happening! :) – Tanmaya Meher Sep 25 '13 at 13:39
Sorry, 5 should not be in the list; that was my copying mistake. The answer should be [1,2,1,2,4,4,'a','b','a','b'], i.e., all repeated (non-unique) elements, including their repetitions, with order maintained. – Tanmaya Meher Sep 25 '13 at 13:41
Using Python 3.5.3 the suggested solution returns an error: `AttributeError: 'Counter' object has no attribute 'iteritems'`. Instead of `iteritems()` I use `items()` which works fine for me. – gplssm Aug 29 '17 at 16:33

score 4 · Answer 2 · edited May 23 '17 at 12:16

It is better to use collections.Counter:

>>> d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6,'f',3]
>>> from collections import Counter
>>> [k for k, v in Counter(d).iteritems() if v > 1]
['a', 1, 2, 'b', 4]

Also see this relevant thread:

How to find duplicate elements in array using for loop in Python?

edited May 23 '17 at 12:16

Community

answered Sep 25 '13 at 13:25

alecxe

score 3 · Answer 3 · answered Sep 25 '13 at 13:37

I thought I would add my method, which uses a set comprehension, in case anyone is interested.

>>> d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6,'f',3]
>>> d = list({x for x in d if d.count(x) > 1})
>>> print d
['a', 1, 2, 'b', 4]

Set comprehensions require Python 2.7 or later, I believe.

answered Sep 25 '13 at 13:37

Shashank

please **don't use this method**, it has `O(n^2)` complexity – diralik May 31 '19 at 22:00
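To make the complexity point concrete, here is a sketch for Python 3 comparing the `count`-based set comprehension with a `Counter`-based one. The function names are hypothetical, and both assume hashable elements. `list.count` rescans the whole list for every element, whereas `Counter` tallies everything in a single pass.

```python
import timeit
from collections import Counter


def duplicates_quadratic(items):
    # items.count(x) walks the entire list once per element: about n * n steps.
    return {x for x in items if items.count(x) > 1}


def duplicates_linear(items):
    counts = Counter(items)  # single pass over the list
    return {x for x, n in counts.items() if n > 1}


d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6, 'f', 3]
assert duplicates_quadratic(d) == duplicates_linear(d)

# Rough timing on a larger list; the absolute numbers depend on the machine.
big = list(range(500)) * 2
print(timeit.timeit(lambda: duplicates_quadratic(big), number=1))
print(timeit.timeit(lambda: duplicates_linear(big), number=1))
```

For a 15-element list the difference is negligible; it only starts to matter as the list grows.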

score 2 · Answer 4 · answered Sep 25 '13 at 14:27

Thanks for all the answers and comments!

After thinking about it for a while, I found another solution that keeps the structure of my original code, so I am posting it.

>>> d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6,'f',3]
>>> e = d[:]  # copy of d, so removals do not disturb the loop over d
>>> for i in d:
...     if d.count(i) == 1:
...             e.remove(i)
... 
>>> e
[1, 2, 1, 2, 4, 4, 'a', 'b', 'a', 'b']

@arshajii, your explanation led me to this trick. Thanks!
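A closely related variation, sketched here for Python 3, iterates over a snapshot of the list (`d[:]`) and removes from the original. The loop never walks the list that is being mutated, so no element is skipped. This illustrates the same idea and is not the code from the answer above.

```python
d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6, 'f', 3]

# Iterate over a shallow copy; mutate only the original list.
for item in d[:]:
    if d.count(item) == 1:
        d.remove(item)

print(d)  # [1, 2, 1, 2, 4, 4, 'a', 'b', 'a', 'b']
```

Like any `count`-based approach, this still rescans the list for every element, so it is best kept to short lists.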

score 1 · Answer 5 · answered Sep 21 '16 at 06:56

You can also do it like this:

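data = [1, 2, 3, 4, 1, 2, 3, 1, 2, 1, 5, 6]
first_list = []   # elements that occur exactly once
second_list = []  # elements that occur more than once
for i in data:
    if data.count(i) == 1:
        first_list.append(i)
    else:
        second_list.append(i)
# Print once, after the loop has finished
print(second_list)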

Result

[1, 2, 3, 1, 2, 3, 1, 2, 1]

score 0 · Answer 6 · answered Jan 14 '21 at 14:20

For

>>> d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6,'f',3]

Converting the list to a set yields the distinct items:

>>> d_unique = list(set(d))

Non-unique items can be found using a list comprehension:

>>> [item for item in d_unique if d.count(item) > 1]
[1, 2, 4, 'a', 'b']
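Because a set does not preserve order, the order of the result above is not guaranteed. A sketch of an order-preserving alternative, assuming Python 3.7 or later (where dicts keep insertion order), uses `dict.fromkeys` to collect the distinct items. The variable names `d_distinct` and `repeated` are illustrative.

```python
d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6, 'f', 3]

# dict.fromkeys keeps the first occurrence of each item, in list order.
d_distinct = list(dict.fromkeys(d))

# Keep only the items that appear more than once in the original list.
repeated = [item for item in d_distinct if d.count(item) > 1]
print(repeated)  # [1, 2, 4, 'a', 'b']
```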

score 0 · Answer 7 · answered Aug 11 '22 at 11:43

In Python 3, use dict.items() instead of dict.iteritems().

iteritems() was removed in Python 3, so you can't use this method anymore.

    >>> d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6,'f',3]
    >>> from collections import Counter
    >>> [k for k, v in Counter(d).items() if v > 1]
    [1, 2, 4, 'a', 'b']
