2

I want to delate the duplicated elements in a list using this structure but the list remains unchanged when I use this code. Could anyone help me please?

e.g.,item=[1,2,3,4,5,6,7,8,9,1,2,6,7]

def duplicated(item):
        i=0
        j=0
        while i<len(item):
                while j<len(item):
                        if item[j]==item[i] and i!=j:
                                del item[j]
                        j+=1
                i+=1
        return item
  • No repro. Some of the numbers are deleted; it's definitely not unchanged after the function. Are you asking why it doesn't delete _all_ duplicates? – Aran-Fey Apr 21 '18 at 08:14
  • Yes, sorry, I am asking why it doesn't delate all duplicates – Sergio Prieto Apr 21 '18 at 08:15
  • Try running it with e.g. http://pythontutor.com/visualize.html#mode=edit - some *do* get deleted, but your logic is wrong so not all pairs are compared. – jonrsharpe Apr 21 '18 at 08:17
  • Possible duplicate of [How to remove items from a list while iterating?](https://stackoverflow.com/questions/1207406/how-to-remove-items-from-a-list-while-iterating) or as Aran-Fey mentioned on my (deleted) answer: [how-do-you-remove-duplicates-from-a-list-whilst-preserving-order](https://stackoverflow.com/questions/480214/how-do-you-remove-duplicates-from-a-list-whilst-preserving-order) – Patrick Artner Apr 21 '18 at 08:26
  • You are modifying a list while iterating it. See the dupe for why it does not work. – Patrick Artner Apr 21 '18 at 08:27
  • @PatrickArtner For the record, I wouldn't mind an answer that explains why the OP's code doesn't work. I'd just prefer if you linked to an existing question instead of (re-)posting existing solutions. – Aran-Fey Apr 21 '18 at 08:31
  • Possible duplicate of [How to remove item from a python list in a loop?](https://stackoverflow.com/questions/8312829/how-to-remove-item-from-a-python-list-in-a-loop) – Aran-Fey Apr 21 '18 at 08:34

4 Answers4

2

To address why your code isn't working, it is due to the fact that you initialise j at the start of the function, so after the first iteration of the i loop, it is len(item)-1 and then never get's reset for future loops.

This means that you miss many duplicates.

So since we need to initialise it each loop, we still need to know what to. If we initialise it as 0, then we are checking for duplicates behind the current j value in the list so this is a waste of time. To improve efficiency, we should initialise it as i+1 so that we check numbers after i for duplicates as we already know that there are no duplicates of the value at i before the index i in the list. This also simplifies the if as we no longer need to check i != j.

And one final thing is that when you delete j with del, all the indexes after are now shifted down by one. So we also need to subtract 1 from j so that we now check the element straight after the one we just deleted which is now at the same index of the one we just deleted (since they are shifted down).

So, in code:

def duplicated(item):
        i = 0
        while i < len(item):
                j = i + 1
                while j < len(item):
                        if item[j] == item[i]:
                                del item[j]
                                j -= 1
                        j += 1
                i += 1
        return item

and it works with the examples people have given:

>>> duplicated([1,2,3,4,5,6,7,8,9,1,2,6,7])
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> duplicated([1,2,3,4,5,6,7,8,9,1,1,2,6,7])
[1, 2, 3, 4, 5, 6, 7, 8, 9]

However, this solution is current of complexity O(n^2) since we are using nested loops so the time taken is proportional to the size of the input list squared.

But if we were able to modify the algorithm to use only one loop, we would reduce the complexity to O(n). This can be done by using a set for lookup. Adding and checking if an element is in the set is O(1) (average case) so we can use them to speed things up.

So if we have a set which contains the elements we have already seen, then for each future element, if it is already in the seen set, we delete it, otherwise we add it to this set.

So, in code:

def duplicated(item):
    seen = set()
    i = 0
    while i < len(item):
        if item[i] in seen:
            del item[i]
        else:
            seen.add(item[i])
            i += 1
    return item

And I can confirm that this passes the test cases above as well.

One last thing I should point out is that when deleting the element here, I did not subtract from the pointer, this is because before we subtracted as we knew it would later be incremented and we wanted it to be the same, however here, it is only incremented in the else block, so if we don't do anything, it will stay the same.

Joe Iddon
  • 20,101
  • 7
  • 33
  • 54
  • Alternative solutions can be found [here](https://stackoverflow.com/questions/480214/how-do-you-remove-duplicates-from-a-list-whilst-preserving-order). – Aran-Fey Apr 21 '18 at 08:54
0

Actually variable j should start with next item of what i picks.

def duplicated(item):
    i=0
    while i<len(item):
        j = i+1
        while j<len(item):
            if item[j]==item[i] and i!=j:
                del item[j]
                j -= 1
            j+=1
        i+=1
    return item
MaNKuR
  • 2,578
  • 1
  • 19
  • 31
-1

You should reinitialize your j var at each turn inside of the i loop, otherwise, j is always going to be equal to len(item) after the first iteration.

def duplicated(item):
    i=0
    while i<len(item):
        j=0
        while j<len(item):
            if item[j]==item[i] and i!=j:
                print(i,j )
                del item[j]
            j+=1
        i+=1
    return item

However, The best way, if you don't care about your list order, to do what you want will probably be to convert your list to a set and then back to a list, has a set can only have distinct elements.

def duplicated(item):
    return list(set(item))
Axnyff
  • 9,213
  • 4
  • 33
  • 37
-1

You need to reinitilaze j at start of nested while loop each time:

def duplicated(item):
        i=0
        j=0
        while i<len(item)-1:
            j=i+1
            while j<len(item):
                    if item[j]==item[i]:
                            del item[j]
                            j -= 1
                    j+=1
            i+=1
        return item

OUT

[1, 2, 3, 4, 5, 6, 7, 8, 9]

However you can try below simpler code it will remain the insertion order of list

def duplicated(item):
    unique_list=[]
    for i in item:
        if i not in unique_list:
            unique_list.append(i)
    return unique_list
Roushan
  • 4,074
  • 3
  • 21
  • 38
  • This throws an IndexError with the input `[1,2,1,1]`. Also, the 2nd solution is horrible because it's O(n^2). Please link to [an existing question](https://stackoverflow.com/questions/480214/how-do-you-remove-duplicates-from-a-list-whilst-preserving-order) for solutions instead of posting your own (bad) solutions. – Aran-Fey Apr 21 '18 at 08:35
  • @Aran-Fey it should be fine for novice , as first they should focus on working soln and later can do focus on efficiency – Roushan Apr 21 '18 at 08:49
  • @Aran-Fey using one for loop how it will be O(n^2) – Roushan Apr 21 '18 at 08:50
  • `if i not in unique_list` <- there's your 2nd loop. And that information could be found in the question I linked. That's why I'd prefer if you didn't repost existing solutions. All the strengths and weaknesses have already been pointed out and voted on. I'm not interested in having to repeat all my complaints because people keep reposting the same inefficient code. – Aran-Fey Apr 21 '18 at 08:52
  • @Aran-Fey the question he you refer asked explicitly for efficiency time but this question needs the code to work, and the code he used using while loop are more complex than i proposed alternate one – Roushan Apr 21 '18 at 08:55