0

I'm currently working through the Springboard data science career track admissions test. One of the questions asked me to remove all non-duplicates from a list of numbers entered via one line of standard input, separated by spaces, and return a list of the duplicates only.

def non_unique_numbers(line):
    for i in line:
        if line.count(i) < 2:
            line.remove(i)
    return line


lin = input('go on then')
line = lin.split()
print(non_unique_numbers(line))

The output is inconsistent: at times it seems to remove every other non-duplicate, but it never removes all of them. Can you please let me know where I am going wrong?

Tomerikoo
YungReezy
  • In general, it's not safe to `remove` items from a list while looping over it. As you have seen, this causes inconsistent behavior. Look at [this question](https://stackoverflow.com/questions/1207406/how-to-remove-items-from-a-list-while-iterating), which is similar. Modified from there: `res = [x for x in line if line.count(x) > 1]` – Tomerikoo Jul 09 '19 at 15:13
  • Also look into using a `set()` to keep track of which elements you have seen once and which you have seen more than once. – Error - Syntactical Remorse Jul 09 '19 at 15:14
  • Possible duplicate of [How to remove items from a list while iterating?](https://stackoverflow.com/questions/1207406/how-to-remove-items-from-a-list-while-iterating) – Procrastinator Jul 09 '19 at 15:20

2 Answers

0

When you write `for i in line`, Python creates an iterator over `line` that keeps an internal index and advances it by one on every iteration. Removing items from `line` changes the list, but it does not adjust the iterator's position.

So, when you remove the element at some index j, every item at an index greater than j shifts down by one. The item that used to be at index j+1 is now at index j, but the loop still moves on to index j+1, so that item is skipped.

A good way to see this is to run your function on a list that contains no duplicates at all:

>>> l = [0, 1, 2, 3, 4, 5]
>>> print(non_unique_numbers(l))
[1, 3, 5]

You can see that only the values at even indices were removed, exactly as the logic described above predicts.
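To make the skipping visible, here is a minimal sketch that repeats the loop from the question but also records which values the loop actually visits. The helper name `trace_removals` is hypothetical and exists only for this illustration.

```python
def trace_removals(values):
    """Hypothetical helper: run the question's loop and record each visited value."""
    items = list(values)  # work on a copy so the caller's list is untouched
    visited = []
    for item in items:
        visited.append(item)
        if items.count(item) < 2:
            items.remove(item)  # shifts later items down; the next one gets skipped
    return visited, items


visited, remaining = trace_removals([0, 1, 2, 3, 4, 5])
print(visited)    # [0, 2, 4]
print(remaining)  # [1, 3, 5]
```

The loop never looks at 1, 3 or 5, which is why they survive.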

What you want to do is collect your results in a new, separate list. For that you could use a simple list comprehension:

lin = input('go on then')
line = lin.split()
print([x for x in line if line.count(x) > 1])
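If you would rather keep the original loop structure, a common alternative is to iterate over a copy of the list and remove from the original. This is only a sketch; the function name `non_unique_numbers_copy` is made up for this example.

```python
def non_unique_numbers_copy(line):
    # Iterate over a shallow copy so removals from `line` do not disturb the loop.
    for item in line[:]:
        if line.count(item) < 2:
            line.remove(item)
    return line


print(non_unique_numbers_copy([0, 1, 2, 3, 4, 5]))  # []
print(non_unique_numbers_copy("1 2 2 3".split()))   # ['2', '2']
```

Note that this still mutates the list passed in, and both this version and the list comprehension call `count` once per element, so they are quadratic in the length of the input.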
Tomerikoo
-2

It is not safe to modify a list while iterating through it. The actual problem, I think, is that `remove()` only removes the first instance of the value, which would make the `< 2` check fail on the last element, so `remove()` is never called for it.

It is better to use a hash table to count the occurrences of each value and then filter out the values whose count is < 2.
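One way to sketch the hash-table idea is with `collections.Counter` from the standard library. The function name `keep_duplicates` is hypothetical, and the sample input is the one the OP gives in the comments.

```python
from collections import Counter


def keep_duplicates(tokens):
    """Hypothetical helper: return the tokens that occur more than once, in input order."""
    counts = Counter(tokens)  # single pass to build the table of counts
    return [tok for tok in tokens if counts[tok] > 1]


print(keep_duplicates("1 2 2 3 4 5 5 6 7 7 7 8".split()))
# ['2', '2', '5', '5', '7', '7', '7']
```

This builds the counts once instead of calling `list.count` for every element, and it never modifies the list being iterated.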

  • The question from the OP was what they were doing wrong. Would you, @tomerikoo, prefer that it be a complete solution? – Mochnant Jul 09 '19 at 15:21
  • That is not strictly true, since it depends on how you define "duplicate" and what the expected output is. Does the OP mean unique duplicates, or the "excess" duplicates? Until the OP clarifies, you cannot say that my "claim is not true." – Mochnant Jul 09 '19 at 15:26
  • A duplicate here is any number in the input that appears more than once. The expected output from an input of, say, 1 2 2 3 4 5 5 6 7 7 7 8 would be 2 2 5 5 7 7 7. – YungReezy Jul 15 '19 at 12:57