
Could you please help me find the error in my logic? The function doesn't work correctly and I can't figure out why. The instructions are: Write a function remove_duplicates that takes in a list and removes elements of the list that are the same. Do not modify the list you take as input! Instead, return a new list. For example: remove_duplicates([1,1,2,2]) should return [1,2].

For [1, 2, 2, 5, 5, 5, 7, 7] I'm getting this output [1, 2, 5, 7], which is good. However, for [4, 9, 9, 4] the output is [9, 9, 4], which is wrong. I can't find out what the problem is. I started learning programming a few weeks ago, so I'm a novice. Thanks!

My code:

def remove_duplicates(l):
    nl = list(l)
    i = 0    
    while i <= len(nl)-2:
        j = i + 1
        while j <= len(nl)-1:
            if nl[i] == nl[j]:
                nl.remove(nl[j])
            else:
                j += 1
        i += 1
    return nl
Peter

5 Answers

You want to use a set. A set holds each value only once, so converting the list to a set removes the duplicates:

def remove_duplicates(l):
    return list(set(l))


l_1 = [1,1,2,2]
l_2 = remove_duplicates(l_1)

print(l_1)  # the input list is unchanged
print(l_2)  # the new, de-duplicated list

Outputs:

[1, 1, 2, 2]
[1, 2]

Alternatively, with your other list:

[4, 9, 9, 4]
[9, 4]

Notice that the function wraps the set in list(); otherwise you would get a set back instead of a new list.

Andy

In Python, you can use a set to remove duplicates:

>>> a = [1, 2, 2, 5, 5, 5, 7, 7]
>>> set(a)
set([1, 2, 5, 7])

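If you trace your code on [4, 9, 9, 4]:

i = 0, j = 3         # nl[0] == nl[3]: the duplicate 4 is found at the end of the list
nl.remove(nl[3])     # but remove() deletes the first 4, at the front
nl = [9, 9, 4]       # j = 3 is now past the end, so the inner loop stops
i = 1, j = 2         # nl[1] is 9 and nl[2] is 4: no match
[9, 9, 4]

The two 9s now sit at indexes 0 and 1, but i has already moved on to 1, so they are never compared and the duplicate 9 is not removed.

If you replace nl.remove(nl[j]) with del nl[j] in your code, it will work fine.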
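To check that trace yourself, here is a rough sketch of the question's function with one addition: a list called steps that records i, j and the list contents after every removal. The names trace_remove_duplicates and steps are made up for this illustration; the loop itself, including the buggy nl.remove(nl[j]) call, is copied from the question.

```python
def trace_remove_duplicates(l):
    """The question's loop, unchanged, plus a log of every removal."""
    nl = list(l)
    steps = []  # hypothetical helper: (i, j, list after the removal)
    i = 0
    while i <= len(nl) - 2:
        j = i + 1
        while j <= len(nl) - 1:
            if nl[i] == nl[j]:
                nl.remove(nl[j])  # the bug: removes the FIRST equal value
                steps.append((i, j, list(nl)))
            else:
                j += 1
        i += 1
    return nl, steps


result, steps = trace_remove_duplicates([4, 9, 9, 4])
print(result)  # [9, 9, 4]
print(steps)   # [(0, 3, [9, 9, 4])]
```

Only one removal happens, at i = 0 and j = 3, and it takes out the 4 at the front rather than the one at index 3.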

Hackaholic
  • This is not incorrect, but it should be noted that this returns a `set` not a `list`, like the questioner asked for. – Andy Dec 14 '14 at 15:12

You might look into deduplication via converting the list object into a set.

my_list = [4, 9, 9, 4]
deduped = list(set(my_list))
print(deduped)  # prints [9, 4]
rchang

Your main problem is the line

nl.remove(nl[j])

You want to remove the item at index j. What this line actually does is remove the first occurrence of the value contained at index j.

Instead, try

del nl[j]

Edit:

Let's trace your example, remove_duplicates([4,9,9,4]):

nl = [4, 9, 9, 4]
i = 0

j = 1
nl[0] != nl[1]

j = 2
nl[0] != nl[2]

j = 3
nl[0] == nl[3]

At this point, you want to get rid of nl[3] by calling

nl.remove(nl[3])

but see what happens:

>>> nl = [4, 9, 9, 4]
>>> nl.remove(nl[3])            # you expect [4, 9, 9, {deleted}]
>>> nl
[9, 9, 4]                       # but get    [{deleted}, 9, 9, 4]

which causes further issues by shifting the list: i and j no longer point at the same items.

The cause is simple:

>>> help(list.remove)
L.remove(value) -> None -- remove first occurrence of value.
                                    ^^                 ^^

you are telling it to remove a value, not a location.

If instead you do

nl = [4, 9, 9, 4]
del nl[3]          # delete a *location*, not a *value*     
                   # gives [4, 9, 9, {deleted}]

you get the result you were expecting.
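Putting that one-line change back into the function from the question might look like the sketch below. Everything except the del nl[j] line and the comments is taken from the question as posted; treat it as one possible fix, not the only one.

```python
def remove_duplicates(l):
    nl = list(l)  # work on a copy so the caller's list is not modified
    i = 0
    while i <= len(nl) - 2:
        j = i + 1
        while j <= len(nl) - 1:
            if nl[i] == nl[j]:
                # Delete by position. j is not advanced, because the
                # next element has just shifted into index j.
                del nl[j]
            else:
                j += 1
        i += 1
    return nl


original = [4, 9, 9, 4]
print(remove_duplicates(original))  # [4, 9]
print(original)                     # [4, 9, 9, 4]
```

Because later duplicates are the ones deleted, the first occurrence of each value stays where it was, so the original order is kept.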

Hugh Bothwell

If you want the output for [4, 9, 9, 4] to be [4, 9], note that the other answers that just use list(set(a)) do not preserve the order of the elements.

To preserve the order, use:

def remove_duplicates(l):
    s = set()
    o = []
    for i in l:
        if i not in s:
            s.add(i)
            o.append(i)
    return o

See that:

>>> remove_duplicates([4, 9, 9, 4])
[4, 9]
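A shorter variant of the same idea, offered as a sketch under two assumptions: you are on Python 3.7 or newer, where dict is guaranteed to keep insertion order, and every element is hashable. On older versions, including the Python 2 used elsewhere in this thread, a plain dict does not promise any order. The name remove_duplicates_fromkeys is invented here to avoid clashing with the function above.

```python
def remove_duplicates_fromkeys(items):
    # dict.fromkeys keeps only the first occurrence of each key and,
    # on Python 3.7+, remembers the order in which keys were first seen.
    return list(dict.fromkeys(items))


data = [4, 9, 9, 4]
print(remove_duplicates_fromkeys(data))  # [4, 9]
print(data)                              # [4, 9, 9, 4]
```

The explicit loop above works on any Python version, which may matter more than brevity.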
Dan D.