Delete similar objects in list python

Question

I have a bunch of Album objects in a list (code for objects posted below). 5570 to be exact. However, when looking at unique objects, I should have 385. Because of the way that the objects are created (I don't know if I can explain it properly), I thought it would be best to add all the objects into the list, and then delete the ones that are similar afterwards.

Certain objects have the same strings for each argument (artist, title, tracks), and I would like to get rid of them. However, I know I cannot simply remove the duplicates, since they are stored in separate memory locations and therefore aren't the same object.

Can anyone help me with removing the duplicates?

As you can probably tell, I am quite new to python.

Thanks in advance!

class Album(object) :
    def __init__(self, artist, title, tracks = None) :
        tracks = []
        self.artist = artist
        self.title = title
        self.tracks = tracks

    def add_track(self, track) :
        self.track = track
        (self.tracks).append(track)
        print "The track %s was added." % (track)

    def __str__(self) :
        return "Artist: %s, Album: %s [" % (self.artist, self.title) + str(len(self.tracks)) + " Tracks]"

score 0 · Answer 1 · answered Oct 05 '14 at 03:34

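You can make your class hashable over the tuple (artist, title, tracks) and store the objects in a set, which keeps only unique objects. Since tracks is a list, and lists are not hashable, convert it to a tuple before hashing. The class also needs an __eq__ based on the same tuple, because a set uses both the hash and equality to decide whether two objects are duplicates.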
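A minimal sketch of that idea, written so that it runs on Python 3 (the question's code uses Python 2 print statements). The `_key` helper and the sample albums are made up for illustration, and the constructor here keeps the tracks it is given, which is an assumption about what the original class was meant to do.

```python
class Album(object):
    def __init__(self, artist, title, tracks=None):
        self.artist = artist
        self.title = title
        self.tracks = list(tracks) if tracks is not None else []

    def _key(self):
        # Hypothetical helper: the tuple that identifies an album.
        # tracks is a list (unhashable), so it is turned into a tuple.
        return (self.artist, self.title, tuple(self.tracks))

    def __eq__(self, other):
        return type(other) is type(self) and other._key() == self._key()

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        return hash(self._key())


albums = [
    Album("Artist A", "First", ["one", "two"]),
    Album("Artist A", "First", ["one", "two"]),  # duplicate of the first
    Album("Artist B", "Second", ["three"]),
]
unique_albums = set(albums)  # a set keeps one object per distinct key
```

One caveat: the hash depends on mutable attributes, so an album should not be modified (for example via add_track) while it sits in a set, otherwise later lookups can miss it.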

m.wasowski

score 0 · Answer 2 · edited May 23 '17 at 12:20

While the other answer addresses the underlying issue of removing duplicates, it doesn't allow you to keep hold of your Album class, which can prove useful in the future (or even now, via its __str__ method). Therefore, I think you should consider implementing the __eq__ method to compare objects of the Album class. One way to implement it, along with the __ne__ method, would be:

def __eq__(self, other):
    # assuming tracks were added in the same order
    return (type(other) is self.__class__
            and other.artist == self.artist
            and other.title == self.title
            and other.tracks == self.tracks)

def __ne__(self, other):
    return not self.__eq__(other)
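With those two methods in place, a plain loop is already enough to drop the duplicates while keeping the list order, because the `in` operator on a list relies on __eq__. The following is only a sketch: `remove_duplicates` and the sample albums are hypothetical names, and the constructor is simplified to keep the tracks it receives.

```python
class Album(object):
    def __init__(self, artist, title, tracks=None):
        self.artist = artist
        self.title = title
        self.tracks = list(tracks) if tracks is not None else []

    def __eq__(self, other):
        # assuming tracks were added in the same order
        return (type(other) is self.__class__
                and other.artist == self.artist
                and other.title == self.title
                and other.tracks == self.tracks)

    def __ne__(self, other):
        return not self.__eq__(other)


def remove_duplicates(albums):
    """Return a new list holding the first occurrence of each album."""
    unique = []
    for album in albums:
        if album not in unique:  # membership test compares with __eq__
            unique.append(album)
    return unique


albums = [
    Album("Artist A", "First", ["one", "two"]),
    Album("Artist B", "Second", ["three"]),
    Album("Artist A", "First", ["one", "two"]),  # duplicate of the first
]
unique_albums = remove_duplicates(albums)
```

Each album is compared against every album kept so far, so the work grows quadratically with the list size. For a few thousand objects that should still be fine; a set-based approach scales better but needs __hash__ too.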

Note that checking the exact type, rather than testing whether one object is an instance of the other's class, can save you from a dangerous pitfall with inheritance in which the result depends on the order of the operands when it shouldn't (e.g. a == b and b == a return different values).

An alternative generic solution, which works for simple container classes like the one you have, is to compare the objects' attribute dictionaries:

def __eq__(self, other):
    return type(other) is self.__class__ and other.__dict__ == self.__dict__
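To see what the generic comparison does in practice, here is a small sketch. `LiveAlbum` is a hypothetical subclass that exists only to show the effect of the exact type check, and the class is cut down to two attributes to keep the example short.

```python
class Album(object):
    def __init__(self, artist, title):
        self.artist = artist
        self.title = title

    def __eq__(self, other):
        return type(other) is self.__class__ and other.__dict__ == self.__dict__

    def __ne__(self, other):
        return not self.__eq__(other)


class LiveAlbum(Album):
    """Hypothetical subclass, used only to show the effect of the type check."""


# Same class, same attributes: equal.
same = Album("Artist A", "First") == Album("Artist A", "First")
# Same class, one attribute differs: not equal.
different_title = Album("Artist A", "First") == Album("Artist A", "Second")
# Same attributes but a different class: not equal, in either operand order.
different_type = Album("Artist A", "First") == LiveAlbum("Artist A", "First")
different_type_reversed = LiveAlbum("Artist A", "First") == Album("Artist A", "First")
```

Comparing __dict__ picks up every instance attribute automatically, which is convenient but also means any attribute you add later takes part in the comparison.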

If you implement the __hash__ method as well, you can add your objects to a set, which guarantees that there are no duplicates. Here is a suggested generic implementation for simple container classes like yours:

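def __hash__(self):
    """Override the default hash behavior (that returns the id of the object)"""
    # Every attribute value must be hashable: a list such as self.tracks
    # raises TypeError here, so store it as a tuple or convert it first.
    return hash(tuple(sorted(self.__dict__.items())))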
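One thing to watch with the generic hash: Album stores tracks in a list, and hashing a tuple that contains a list raises TypeError. A possible workaround, sketched below, is to convert list values to tuples before hashing. The `_frozen_items` helper, the `unique_in_order` function and the sample data are assumptions for illustration, not part of the original class.

```python
class Album(object):
    def __init__(self, artist, title, tracks=None):
        self.artist = artist
        self.title = title
        self.tracks = list(tracks) if tracks is not None else []

    def _frozen_items(self):
        # Hypothetical helper: the attribute items, with list values turned
        # into tuples so that every value is hashable.
        return tuple(sorted(
            (name, tuple(value) if isinstance(value, list) else value)
            for name, value in self.__dict__.items()
        ))

    def __eq__(self, other):
        return type(other) is self.__class__ and other.__dict__ == self.__dict__

    def __ne__(self, other):
        return not self.__eq__(other)

    def __hash__(self):
        return hash(self._frozen_items())


def unique_in_order(albums):
    """Return the albums without duplicates, keeping first occurrences."""
    seen = set()
    result = []
    for album in albums:
        if album not in seen:  # uses __hash__ first, then __eq__
            seen.add(album)
            result.append(album)
    return result


albums = [
    Album("Artist A", "First", ["one", "two"]),
    Album("Artist B", "Second", ["three"]),
    Album("Artist A", "First", ["one", "two"]),  # duplicate of the first
]
unique_albums = unique_in_order(albums)
```

Unlike a bare set(albums), this keeps the original order of the list. It only handles list values one level deep; nested lists or dicts inside an attribute would need more work.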

You can also check this out for suggested implementations.

A few additional remarks regarding your code:

There is no point in accepting an argument for tracks in your __init__ method if you're overwriting it anyway with an empty list.
There is no point in setting self.track in your add_track method, since it is not used anywhere and would be overwritten on add_track's next invocation. There is also no need for the parentheses around self.tracks. Your method should look like this:
```
def add_track(self, track) :
    self.tracks.append(track)
    print "The track %s was added." % (track)
```

Your string representation method needs a bit of fixing up. A single format string is easier to read than mixing % formatting with string concatenation:

def __str__(self) :
    return "Artist: %s, Album: %s [%d tracks]" % (self.artist, self.title, len(self.tracks))
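Putting these remarks together, the class could look roughly like the sketch below. Keeping the tracks argument and copying it is just one way to resolve the first remark (the other is to drop the parameter entirely), so treat that part as an assumption. print is called with parentheses so the code runs on Python 3, and with a single argument it behaves the same on Python 2.

```python
class Album(object):
    def __init__(self, artist, title, tracks=None):
        self.artist = artist
        self.title = title
        # Copy the caller's tracks if given; otherwise start with a fresh list.
        # (A mutable default such as tracks=[] would be shared by all instances.)
        self.tracks = list(tracks) if tracks is not None else []

    def add_track(self, track):
        self.tracks.append(track)
        print("The track %s was added." % track)

    def __str__(self):
        return "Artist: %s, Album: %s [%d tracks]" % (
            self.artist, self.title, len(self.tracks))


album = Album("Artist A", "First", ["one"])
album.add_track("two")
summary = str(album)
empty_album = Album("Artist B", "Second")
```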
