Iterator Object for Removing Duplicates in Python

Question

Hi, I'm trying to figure out how to create an iterator object in Python that removes, or rather skips, duplicates.

For example, given the sequence (1, 2, 3, 3, 4, 4, 5), I want to get (1, 2, 3, 4, 5).

I understand that to get an iterator object I have to write a class for it. So far I have:

class Unique:
    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.i < self.n:
            ...  # this is where I'm stuck

I'm actually not entirely sure what to do next in this problem. Thanks in advance for any comments or help!

Why create a new class instead of using a `set` or subclassing `set`? — Deacon, Aug 14 '15 at 15:00
It's more of an exercise in implementing this with iterator objects (for a quiz in my class), and it helps me understand how iterators work. I understand that I could use the built-in `set`, but if someone could please help me write this it would be greatly appreciated. — d'chang, Aug 14 '15 at 15:02

score 8 · Accepted Answer · edited May 23 '17 at 11:44

It is simpler to write a generator function, like this:

>>> def unique_values(iterable):
...     seen = set()
...     for item in iterable:
...         if item not in seen:
...             seen.add(item)
...             yield item
...

You can then build a tuple of the unique values, like this:

>>> tuple(unique_values((1, 2, 3, 3, 4, 4, 5)))
(1, 2, 3, 4, 5)
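
Because the generator yields items one at a time, it also works on input that never ends, as long as the consumer stops asking. The following is only a sketch to illustrate that laziness; the `stream` and `first_five` names are made up for the example, and `unique_values` is repeated so the snippet runs on its own.

```python
from itertools import count, islice


def unique_values(iterable):
    seen = set()
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item


# An endless stream with duplicates: 0, 0, 1, 1, 2, 2, ...
stream = (n // 2 for n in count())

# islice stops pulling items once it has five of them.
first_five = list(islice(unique_values(stream), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```

Note that `seen` keeps growing for as long as new values arrive, so memory use is proportional to the number of distinct items.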

If you know for sure that the data will always be sorted (or, more generally, that equal items are always adjacent), you can avoid creating the set and keep track of only the previous item, like this:

>>> def unique_values(iterable):
...     it = iter(iterable)
...     try:
...         previous = next(it)
...     except StopIteration:
...         return  # empty input: nothing to yield
...     yield previous
...     for item in it:
...         if item != previous:
...             previous = item
...             yield item
...
>>> tuple(unique_values((1, 2, 3, 3, 4, 4, 5)))
(1, 2, 3, 4, 5)
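
For the adjacent-duplicates case, the standard library's `itertools.groupby` can do the same job, since it groups consecutive equal items. This is an alternative sketch rather than part of the approach above, and the `unique_adjacent` name is made up for the example.

```python
from itertools import groupby


def unique_adjacent(iterable):
    """Yield one item per run of consecutive equal items."""
    for key, _group in groupby(iterable):
        yield key


print(tuple(unique_adjacent((1, 2, 3, 3, 4, 4, 5))))  # (1, 2, 3, 4, 5)
print(tuple(unique_adjacent((1, 1, 2, 1))))            # (1, 2, 1)
print(tuple(unique_adjacent(())))                      # ()
```

The second call shows why the input has to be sorted or grouped: a value that reappears later, after a different value, is yielded again.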

You can also write an iterator object as a class, like this:

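>>> class Unique:
...     def __init__(self, iterable):
...         self._it = iter(iterable)   # underlying iterator we pull from
...         self._seen = set()          # items already returned
...
...     def __iter__(self):
...         # An iterator returns itself from __iter__
...         return self
...
...     def __next__(self):
...         while True:
...             # next() raises StopIteration when the source is exhausted,
...             # which also ends this iterator
...             next_item = next(self._it)
...             if next_item not in self._seen:
...                 self._seen.add(next_item)
...                 return next_item
...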
>>> for item in Unique((1, 2, 3, 3, 4, 4, 5)):
...     print(item)
... 
1
2
3
4
5
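
To see what the `for` loop does with such an object, it can help to drive the iterator by hand with `iter()` and `next()`. The sketch below repeats a version of the `Unique` class so it runs on its own; `collected` and `u` are names made up for the example.

```python
class Unique:
    def __init__(self, iterable):
        self._it = iter(iterable)
        self._seen = set()

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            item = next(self._it)  # StopIteration propagates to the caller
            if item not in self._seen:
                self._seen.add(item)
                return item


u = Unique((1, 2, 3, 3, 4, 4, 5))
print(iter(u) is u)  # True: an iterator is its own iterator

# Roughly what "for item in u" does behind the scenes.
collected = []
while True:
    try:
        collected.append(next(u))
    except StopIteration:
        break
print(collected)  # [1, 2, 3, 4, 5]

# The iterator is single-pass: once exhausted, it stays empty.
print(list(u))  # []
```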

You can refer to this answer and to the Iterator Types section of the Python 3 documentation.

I understand how to create the generator function but I was wondering how to create an iterator object for the same method. I'm reviewing for a test so it would be really helpful! — d'chang, Aug 14 '15 at 14:53
Keep in mind this assumes the items in `iterable` are hashable (so that they can be added to the set). — chepner, Aug 14 '15 at 14:54
@thefourtheye Thanks, that makes a lot of sense, and thanks for the linked answer! — d'chang, Aug 14 '15 at 15:04
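
Following up on chepner's point about hashability: if the input may contain unhashable items such as lists or dicts, one possible workaround is to fall back to a plain list for those items. This is only a sketch under that assumption, `unique_any` is a hypothetical name, and the list lookup is linear, so it gets slow when there are many unhashable items.

```python
def unique_any(iterable):
    """Yield each item once, in order, even if some items are unhashable."""
    seen_hashable = set()
    seen_unhashable = []
    for item in iterable:
        try:
            # Raises TypeError for unhashable items such as lists or dicts.
            if item in seen_hashable:
                continue
            seen_hashable.add(item)
        except TypeError:
            # Fall back to a linear search through a list.
            if item in seen_unhashable:
                continue
            seen_unhashable.append(item)
        yield item


mixed = [[1], [1], 2, 2, [3], {"a": 1}, {"a": 1}]
print(list(unique_any(mixed)))  # [[1], 2, [3], {'a': 1}]
```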

score 0 · Answer 2 · answered Aug 14 '15 at 15:04

If preserving original order is not important, simply use set:

values = (1, 3, 2, 5, 4, 3)
unique_values = set(values)
print(unique_values)  # a set, e.g. {1, 2, 3, 4, 5}; its order is not guaranteed
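
If the original order does matter, one common alternative (not part of this answer) is `dict.fromkeys`, since dicts preserve insertion order in Python 3.7 and later. A minimal sketch using the same values; `ordered_unique` is a name made up for the example.

```python
values = (1, 3, 2, 5, 4, 3)

# dict keys are unique and keep first-seen order, so the second 3 is dropped.
ordered_unique = tuple(dict.fromkeys(values))
print(ordered_unique)  # (1, 3, 2, 5, 4)
```

Like `set`, this builds the whole result at once and requires hashable items, so it is not a lazy iterator.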

Jimothy


score 0 · Answer 3 · answered Apr 23 '23 at 01:18

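This removes all duplicates and keeps the container type, but it does not preserve the original order and it requires hashable items: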

new_stuff = type(old_stuff)(set(old_stuff))
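
A quick sketch of how that one-liner behaves for a few container types. The `dedupe_same_type` wrapper is a hypothetical name added only so the expression can be reused; results are sorted before printing because a set does not guarantee any order.

```python
def dedupe_same_type(old_stuff):
    # Same expression as above, wrapped in a function for reuse.
    return type(old_stuff)(set(old_stuff))


as_list = dedupe_same_type([3, 1, 3, 2])
as_tuple = dedupe_same_type((3, 1, 3, 2))
print(type(as_list).__name__, sorted(as_list))    # list [1, 2, 3]
print(type(as_tuple).__name__, sorted(as_tuple))  # tuple [1, 2, 3]

# Caveat: for a str, the constructor returns the set's text representation,
# not a string made of the unique characters.
as_str = dedupe_same_type("aab")
print(as_str)
```

So the trick fits types whose constructor accepts an iterable of items, such as `list` and `tuple`, but not `str`.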

Toothpick Anemone

