Function to remove duplicates from a List | Python

Question

I am trying to write a function remove_duplicates that returns only the unique values from an input list. The code I came up with gets stuck in an infinite loop, and I cannot see why. The goal is not just to get the result, since I have discovered there are direct methods such as set for this. I mainly want to understand my mistake, as this is my first language and my first day of any kind of coding.

def remove_duplicates(x):
    z = [x[0]]
    for i in range(1,len(x)):
        y = i-1
        k = 0
        while y >= 0:
            if x[i] == x[y]:
               k = k + 1 
               y -= 1
        else:
            break
        if k == 0:
            z.append(x[i])
    return z

If not `x[i] == x[y]:` you never decrease `y` and get stuck in the loop. — tobias_k, Mar 01 '16 at 19:07
Any particular reason you don't just use `in`, or a `set` to get rid of duplicates? — tobias_k, Mar 01 '16 at 19:08
Yes, I am trying to understand loops by writing functions from scratch. Concepts are more important to me right now rather than dummy results. Thanks for helping. — Aman Dhoot, Mar 01 '16 at 19:34

score 25 · Answer 1 · answered Mar 01 '16 at 19:10

Use Python's built-in set.

y = list(set(x))

y will be a list of the unique elements of x. This works only if the elements of x can be stored in a set, so they have to be hashable, i.e. implement __eq__() and __hash__().

svohara


score 6 · Answer 2 · answered Mar 01 '16 at 19:11

You can use a set to remove the duplicate elements from the list, like this:

my_list = [1, 2, 3, 1, 1, 1, 1, 1, 2, 3, 4]

Now remove the duplicate elements from this list:

list(set(my_list))

Answer: [1, 2, 3, 4]

Talat Parwez


And now try this to de-duplicate the list `[4,3,2,1]`. Or a list of lists. – tobias_k Mar 01 '16 at 19:19
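
To make the comment above concrete, here is a minimal sketch of the two cases it mentions. The helper name dedupe_with_set is hypothetical and only wraps list(set(...)); the element order of the first result is not guaranteed, so no particular order is shown.

```python
def dedupe_with_set(items):
    # Hypothetical wrapper around the list(set(...)) idiom from the answers.
    return list(set(items))


# Case 1: the elements survive, but a set does not promise to keep
# the original order [4, 3, 2, 1].
unordered = dedupe_with_set([4, 3, 2, 1])
print(unordered)

# Case 2: a list of lists cannot be put into a set, because lists
# are not hashable, so a TypeError is raised.
try:
    dedupe_with_set([[1, 2], [1, 2]])
    unhashable_failed = False
except TypeError:
    unhashable_failed = True
print(unhashable_failed)
```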

tobias_k · Answer 3 · 2016-03-01T19:26:07.693

The main problem with your code seems to be here:

while y >= 0:
    if x[i] == x[y]:
       k = k + 1 
       y -= 1

Here, you decrement y only if the current element is a match; otherwise y never changes and the while loop never ends. Also, you have to remove the else: break, because that else belongs to the while and runs as soon as the while finishes, so the outer loop that appends elements would stop right after the first element.
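
As a rough sketch, this is what the original function might look like with only those two changes applied. The name remove_duplicates_fixed and the empty-list guard are additions for illustration and are not part of the question's code.

```python
def remove_duplicates_fixed(x):
    # Assumption: return an empty list for empty input instead of
    # failing on x[0]; the original code does not handle this case.
    if not x:
        return []
    z = [x[0]]
    for i in range(1, len(x)):
        y = i - 1
        k = 0
        while y >= 0:
            if x[i] == x[y]:
                k = k + 1
            # Decrement on every pass, match or not, so the loop ends.
            y -= 1
        # No "else: break" here, so the outer loop keeps going.
        if k == 0:
            z.append(x[i])
    return z


print(remove_duplicates_fixed([1, 2, 1, 3, 2, 4]))  # [1, 2, 3, 4]
```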

If you want to stay true to your initial approach, you could try this:

def remove_duplicates(x):
    z = [x[0]]
    for i in range(1,len(x)):
        for y in range(0, i):
            if x[i] == x[y]:
                break
        else:
            z.append(x[i])
    return z
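
The else in the function above belongs to the inner for loop, not to the if: it runs only when the loop finishes without hitting break. A small standalone sketch of that behaviour, using a hypothetical helper first_match_index:

```python
def first_match_index(items, target):
    # Hypothetical helper, only here to demonstrate for/else.
    for idx, item in enumerate(items):
        if item == target:
            break  # found it: the else branch is skipped
    else:
        # Reached only if the loop ran to the end without a break.
        return None
    return idx


print(first_match_index([4, 3, 2], 3))  # 1
print(first_match_index([4, 3, 2], 9))  # None
```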

Note, however, that there are much simpler ways to ensure that the elements are unique. For instance, you can use the in operator to check whether the current element is already in the result list, instead of comparing it against each earlier element yourself.

def remove_duplicates(lst):
    res = []
    for x in lst:
        if x not in res:
            res.append(x)
    return res

If the elements are guaranteed to be hashable, you can also use a set. But don't just return list(set(lst)), as this will not preserve the order of the elements in the list. The version below is a bit more code, but faster than using x not in res, because a membership test on a set takes constant time on average instead of scanning the whole list.

def remove_duplicates(lst):
    seen = set()
    res = []
    for x in lst:
        if x not in seen:
            res.append(x)
            seen.add(x)
    return res
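
One way to see the speed difference for yourself is a quick timeit comparison. This is only a sketch: the function names, the test data, and the repeat count are arbitrary choices, and the actual timings depend on your machine, so none are shown.

```python
import timeit


def dedupe_list_scan(lst):
    # Same idea as the "x not in res" version: scans res on every check.
    res = []
    for x in lst:
        if x not in res:
            res.append(x)
    return res


def dedupe_seen_set(lst):
    # Same idea as the "seen" version: set lookups instead of list scans.
    seen = set()
    res = []
    for x in lst:
        if x not in seen:
            res.append(x)
            seen.add(x)
    return res


# Arbitrary sample data: 2000 values, 500 of them distinct.
data = [i % 500 for i in range(2000)]

scan_time = timeit.timeit(lambda: dedupe_list_scan(data), number=3)
set_time = timeit.timeit(lambda: dedupe_seen_set(data), number=3)
print(f"list scan: {scan_time:.4f}s, seen set: {set_time:.4f}s")
```

Both functions return the same list in the same order; only the cost of each membership check differs.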

If you want a one-liner, you could use OrderedDict instead:

import collections

def remove_duplicates(lst):
    # list(...) so that Python 3 returns a list rather than a dict view
    return list(collections.OrderedDict(zip(lst, lst)).values())
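
On Python 3.7 and later, a plain dict also keeps insertion order, so the same idea can be sketched without the import. The name dedupe_ordered is hypothetical, and like the set-based versions this assumes the elements are hashable.

```python
def dedupe_ordered(lst):
    # dict keys are unique and, from Python 3.7 on, keep insertion order,
    # so the keys are the first occurrences in their original order.
    return list(dict.fromkeys(lst))


print(dedupe_ordered([4, 3, 4, 2, 3, 1]))  # [4, 3, 2, 1]
```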
