0

My task is to use a set to convert a list of duplicates to a list of unique numbers. However, I want to retain the positions.

Simple enough I thought; so I made a dictionary that first stores the original list's positions.

def get_positions(a): 
    positions = {}

    for ele in a: 
        if not ele in positions:
            positions[ele] = a.index(ele) 

    return positions

So lets say I have a list a = [1, 2, 4, 4, 5]

Positions will give me a dictionary of {0:1, 1:2, 2:4, 3:4, 4:5}.

This however was unsuccessful because I repeated numbers will not get their positions stored.

Is there a way of achieving this?

Thanks.

UPDATE:

It seems I wasn't clear. I need to use a set. So, I get a list a=[1,2,4,4,5] and I must convert it to a set to erase the duplicates. Then, I need to get a list with the elements in the same order. (It's an assignment problem)

user2417731
  • 179
  • 3
  • 14
  • do you just want the original numbers in the order they first appeared in the original list? – ModulusJoe Oct 08 '13 at 15:03
  • Yes, so for the example with a, I want a list returned that has [1, 2, 4, 5] – user2417731 Oct 08 '13 at 15:03
  • 1
    possible duplicate of [How do you remove duplicates from a list in Python whilst preserving order?](http://stackoverflow.com/questions/480214/how-do-you-remove-duplicates-from-a-list-in-python-whilst-preserving-order) – mgilson Oct 08 '13 at 15:04
  • 1
    Just so you know, you keep saying that you're "using a set", but `positions` is defined as `{}`, which is an empty "dictionary" or "mapping", not a set. – Mark Hildreth Oct 08 '13 at 15:05
  • Is this an arbitary assignment for a programming class? If it is there a number of ways you could complete this with the use of a 'set'. Can you provide the full details of the requirements as I think we are missing something. – ModulusJoe Oct 08 '13 at 15:17

5 Answers5

6

You can use an OrderedDict:

>>> from collections import OrderedDict
>>> 
>>> a = [1, 2, 4, 4, 5]
>>> 
>>> list(OrderedDict.fromkeys(a))
[1, 2, 4, 5]

You can also use a plain set for this. One common way is:

>>> a = [1, 2, 4, 4, 5]
>>> 
>>> seen = set()
>>> [x for x in a if x not in seen and not seen.add(x)]
[1, 2, 4, 5]

The trick here is that not seen.add(x) will always be True because add() always returns None. In practice, I would always use the OrderedDict approach.

See also: How do you remove duplicates from a list in Python whilst preserving order?

Community
  • 1
  • 1
arshajii
  • 127,459
  • 24
  • 238
  • 287
  • 2
    You could use `.fromkeys()`. – DSM Oct 08 '13 at 15:08
  • I updated my question. Please see. I am required to use a set. – user2417731 Oct 08 '13 at 15:11
  • Arshaji So in second trick you are using short-circuit behaviour of and. am I correct? – Grijesh Chauhan Oct 08 '13 at 15:31
  • @arshajii Thanks Arshjit your answer is helpful. I am still python learner. I just read [this comment](http://stackoverflow.com/questions/480214/how-do-you-remove-duplicates-from-a-list-in-python-whilst-preserving-order#comment17454689_480227) So your first technique is better than second one if I correctly understands. – Grijesh Chauhan Oct 08 '13 at 15:41
2

I think you are going about this the wrong way. You are trying to remove duplicates from a list, but you are having a problem which you are trying to solve by having the positions of things in the list without duplicates being removed. Instead, I think it would be better to do something more like this:

def remove_duplicates(seq):
    new_list = []
    for i in seq:
        if i not in new_list:
            new_list.append(i)
    return new_list

One fairly readable way to do this, using sets (with the corresponding O(1) membership testing (but with higher memory usage) is this:

def remove_duplicates(seq):
    seen = set()
    new_list = []
    for i in seq:
        if i not in seen:
            new_list.append(i)
            seen.add(i)
    return new_list

This answer to the same question also uses sets, and is quite possibly faster (but using a bit hacky in using and not set.add).

Community
  • 1
  • 1
rlms
  • 10,650
  • 8
  • 44
  • 61
  • 2
    The problem with this one is that membership testing on a list is O(n) which makes this an O(n^2) algorithm -- when really, you can accomplish this task in O(n) operations using sets or dicts. – mgilson Oct 08 '13 at 15:06
  • @mgilson But in terms of readability (except for me calling the parameter x), and quite possibly in terms of memory usage (you don't have to store the positions of the elements as well) I think this is better. – rlms Oct 08 '13 at 15:09
  • And after all, "premature optimization is the route of all evil"! – rlms Oct 08 '13 at 15:10
  • I updated my question. Please see. I am required to use a set. – user2417731 Oct 08 '13 at 15:11
  • 2
    @user2417731 -- You could easily modify this to use a set as well. Just keep a set in addition to `new_list` and then do the membership test against the `set`. – mgilson Oct 08 '13 at 15:13
  • @user2417731 -- premature optimization is only evil when it takes a long time or makes the code less easy to read. For something this simple, the additional 2 lines of code (literally) and few extra bytes of memory are well worth the linear performance compared to quadratic performance. – mgilson Oct 08 '13 at 15:16
  • @mglison I have modified it just like that! – rlms Oct 08 '13 at 15:17
  • 1
    @user2387370: `seen = {}` makes seen a `dict`, not a `set`. I think you'd need `seen = set()`. – DSM Oct 08 '13 at 15:18
1

This can be done with a loop and an if statement:

>>> oldlist = [1,2,3,3,4,5,4,5,6,2,3,5,7,8,3,3,3,9]
>>> newlist = []
>>> for x in oldlist:
...  if not x in newlist:
...   newlist.append(x)
...
>>> newlist
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>>
ModulusJoe
  • 1,416
  • 10
  • 17
0

Store it as a set of tuples with (position, element)

Joel
  • 5,618
  • 1
  • 20
  • 19
0

you want an OrderedSet. but this sounds like a homework problem, and i don't know if they would accept that.

Corley Brigman
  • 11,633
  • 5
  • 33
  • 40