
I'm trying to write my own code to remove duplicates from a list. Here is my first attempt:

a = [2,2,1,1,1,2,3,6,6]
b = [2,2,1,1,1,2,3,6,6]
for x in a:
    a.remove(x)
    if x in a:
        b.remove(x)
print('the list without duplicates is: ', b)

Unfortunately, it produces this result:

the list without duplicates is:  [2, 1, 2, 3, 6]

Then I tried writing a second one:

a = [2,2,1,1,1,2,3,6,6]
b = [2,2,1,1,1,2,3,6,6]
for i in range(len(a)):
    for x in a:
        a.remove(x)
        if x in a:
            b.remove(x)
print('the list without duplicates is: ', b)

This second one produces the result I expected:

the list without duplicates is:  [1, 2, 3, 6]

I really don't see why the second one behaves differently from the first. In fact, if I apply both to the list:

[2,2,1,1,2,3,6,6]

both of them produce the same result:

the list without duplicates is:  [1, 2, 3, 6]
Axlp1210
  • 2
    Is there a reason you aren't using `set` – SuperStew Jan 12 '18 at 19:19
  • Yeah, I want to try on my own before relying on the built-in. I think that's good for a newbie. – Axlp1210 Jan 12 '18 at 19:21
  • 3
    Just *Google* "python delete element while iterating". – CristiFati Jan 12 '18 at 19:21
  • 2
    Yea you could just google it, but generally, removing elements from a list while you are iterating over it is bad practice. Though sometimes I do this anyway... But your question is just why the second one worked? – SuperStew Jan 12 '18 at 19:23
  • Yeah, to me both of them look the same, right? I can't see any difference. In fact, when I trace my first one with pen and paper, it behaves exactly like the second one. – Axlp1210 Jan 12 '18 at 19:28

2 Answers


why the second one is different

The for loop keeps track of the index it is on. When you remove an item from the list, every later item shifts one position to the left while the loop's index still advances: the item that was at index i+1 is now at index i, so it gets skipped in the next iteration.

To illustrate:

a_list = ['a','b','c','d','e','f','g','h']
for i, item in enumerate(a_list):
    print(f"i:{i}, item:{item}, a_list[i+1]:{a_list[i+1]}")
    print(f"\tremoving {item}", end = ' ---> ')
    a_list.remove(item)
    print(f"a_list[i+1]:{a_list[i+1]}")
>>>
i:0, item:a, a_list[i+1]:b
    removing a ---> a_list[i+1]:c
i:1, item:c, a_list[i+1]:d
    removing c ---> a_list[i+1]:e
i:2, item:e, a_list[i+1]:f
    removing e ---> a_list[i+1]:g
i:3, item:g, a_list[i+1]:h
....

Your second solution works because, even though each pass of the inner for loop skips items, the outer loop repeats the pass and picks up the items that were skipped earlier, until a is empty. Exactly one copy of each value is left in b because b.remove(x) only runs while another x is still in a, so the last copy is never removed.
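To see which items the first version actually visits, here is a minimal sketch that mimics `for x in a: a.remove(x)` with an explicit index. The helper name `visited_items` is made up for illustration; it only reproduces the index bookkeeping described above, applied to the list from the question.

```python
def visited_items(values):
    """Mimic `for x in a: a.remove(x)` with an explicit index.

    Returns the items the loop looked at and what is left in the list.
    (Hypothetical helper, for illustration only.)
    """
    a = list(values)      # work on a copy so the caller's list is untouched
    seen = []
    i = 0
    while i < len(a):     # the length is re-checked on every step, like the for loop
        x = a[i]
        seen.append(x)
        a.remove(x)       # removes the first occurrence; later items shift left
        i += 1            # the index advances anyway, so one item is skipped
    return seen, a


seen, leftover = visited_items([2, 2, 1, 1, 1, 2, 3, 6, 6])
print(seen)      # [2, 1, 1, 3, 6]
print(leftover)  # [2, 1, 2, 6]
```

A single pass looks at only five of the nine items and leaves four behind, including two 2s. That leftover is what the outer loop in the second version goes back over.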

wwii

If you really don't want to use the built-in set, it is much easier to create a new list and append each element only if it isn't already in it.

Something like:

a = [2,2,1,1,1,2,3,6,6]
c = []
for item in a:
    if item not in c:
        c.append(item)
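The same idea can be wrapped in a function, roughly like this. It is only a sketch: the name `dedupe_keep_first` is made up here, and the input list is never modified, so nothing is removed while iterating.

```python
def dedupe_keep_first(values):
    """Return a new list with duplicates dropped, keeping first occurrences.

    (Hypothetical helper name; same logic as the loop above.)
    """
    result = []
    for item in values:
        if item not in result:   # linear search, fine for small lists
            result.append(item)
    return result


a = [2, 2, 1, 1, 1, 2, 3, 6, 6]
c = dedupe_keep_first(a)
print(c)  # [2, 1, 3, 6]
print(a)  # [2, 2, 1, 1, 1, 2, 3, 6, 6]  (unchanged)
```

Note that this keeps the first occurrence of each value, so the order is [2, 1, 3, 6]. The second version in the question keeps the last occurrence of each value instead, which is why it prints [1, 2, 3, 6].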
erickrf