Removing earlier duplicates from a list and keeping order

Question

I want to define a function that takes a list as an argument and removes all duplicates from the list except the last one.

For example: remove_duplicates([3,4,4,3,6,3]) should be [4,6,3]. The answers in the other post do not solve this.

The function should remove each element that also appears later in the list. This is my code:

def remove(y):
    for x in y:
        if y.count(x) > 1:
            y.remove(x)
            
    return y

For the list [1,2,1,2,1,2,3] I get this output: [2,1,2,3]. The expected output is [1,2,3]. Where am I going wrong and how do I fix it?

Use list(set(a)). This should work, but the output will be in ascending order. — Abhay, Jul 18 '20 at 00:28
@Abhay no, that's not what OP wants. Converting to `set` destroys order. Also please don't answer questions in the comments. — wjandrea, Jul 18 '20 at 00:29
@Abhay this gives output in order. Thank you though for trying :) — neuops, Jul 18 '20 at 00:31
Getting `[4,6,3]` from `remove_duplicates([3,4,4,3,6,3])` is ***not*** preserving the order of the values in the list being passed as an argument. — martineau, Jul 18 '20 at 01:04

wjandrea · Accepted Answer · 2020-07-18T02:37:20.500

The other post does actually answer the question, but there's an extra step: reverse the input then reverse the output. You could use reversed to do this, with an OrderedDict:

from collections import OrderedDict

def remove_earlier_duplicates(sequence):
    d = OrderedDict.fromkeys(reversed(sequence))
    return reversed(d)

The output is a reversed iterator object for greater flexibility, but you can easily convert it to a list.

>>> list(remove_earlier_duplicates([3,4,4,3,6,3]))
[4, 6, 3]
>>> list(remove_earlier_duplicates([1,2,1,2,1,2,3]))
[1, 2, 3]

BTW, your remove function doesn't work because you're changing the size of the list as you're iterating over it, meaning certain items get skipped.
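
To see the skipping concretely, here is a small sketch that reruns the question's function next to one possible fix. `remove_buggy` is the question's code under a different name, and `remove_fixed` is a hypothetical helper (not from this answer) that builds a new list instead of mutating the one being iterated:

```python
def remove_buggy(y):
    # The question's function: y shrinks while the for loop walks it,
    # so the loop's internal index jumps over some elements.
    for x in y:
        if y.count(x) > 1:
            y.remove(x)  # removes the FIRST occurrence, shifting the rest left
    return y


def remove_fixed(y):
    # Hypothetical fix: never mutate y; keep an item only if it does not
    # appear again later in the input. Quadratic, but easy to follow.
    return [x for i, x in enumerate(y) if x not in y[i + 1:]]


print(remove_buggy([1, 2, 1, 2, 1, 2, 3]))  # [2, 1, 2, 3]
print(remove_fixed([1, 2, 1, 2, 1, 2, 3]))  # [1, 2, 3]
print(remove_fixed([3, 4, 4, 3, 6, 3]))     # [4, 6, 3]
```

The slice-and-search version is O(n²), so for long inputs the `OrderedDict` approach above is probably the better choice.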

edited Jul 18 '20 at 02:37

answered Jul 18 '20 at 00:58

wjandrea


Welcome! BTW I rejected your edit cause the output is intentionally a reversed iterator object for greater flexibility, like I said. For example you can use it on a bytes object and convert it back to a bytes object on the way out, without an intermediate list: `bytes(remove_earlier_duplicates(b'344363'))` -> `b'463'` – wjandrea Jul 18 '20 at 02:38
Okay, cool! I suggested the edit because the question wanted output in a list form. This is just as good though! – neuops Jul 18 '20 at 02:51

score 1 · Answer 2 · edited Jul 18 '20 at 02:29

I found this way to do it after a bit of research. @wjandrea provided me with the fromkeys method idea and helped me out a lot.

def retain_order(arr): 
    return list(dict.fromkeys(arr[::-1]))[::-1]
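
A quick check against the two inputs from the question, assuming Python 3.7+ (or CPython 3.6), where a plain `dict` keeps insertion order. The function is repeated here only so the snippet runs on its own:

```python
def retain_order(arr):
    # Reverse the input, deduplicate (fromkeys keeps the first occurrence
    # of each key), then reverse back. The net effect is that the LAST
    # occurrence of each value survives, in its original relative order.
    return list(dict.fromkeys(arr[::-1]))[::-1]


print(retain_order([3, 4, 4, 3, 6, 3]))     # [4, 6, 3]
print(retain_order([1, 2, 1, 2, 1, 2, 3]))  # [1, 2, 3]
```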

edited Jul 18 '20 at 02:29

wjandrea


answered Jul 18 '20 at 02:02

neuops


Nice, much simpler! However, [`dict` only preserves order in Python 3.7+ and CPython 3.6](https://stackoverflow.com/q/39980323/4518341). – wjandrea Jul 18 '20 at 02:29
