In general, Python sets don't seem to be designed for retrieving items by key. That's obviously what dictionaries are for. But is there any way, given a key, to retrieve the instance in a set that is equal to that key?
Again, I know this is exactly what dictionaries are for, but as far as I can see, there are legitimate reasons to want to do this with a set. Suppose you have a class defined something like:
    class Person:
        def __init__(self, firstname, lastname, age):
            self.firstname = firstname
            self.lastname = lastname
            self.age = age
Now, suppose I am going to be creating a large number of Person objects, and each time I create a Person object I need to make sure it is not a duplicate of a previous Person object. A Person is considered a duplicate of another Person if they have the same firstname, regardless of other instance variables. So the obvious thing to do is insert all Person objects into a set, and define __hash__ and __eq__ methods so that Person objects are compared by their firstname.
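To make this concrete, here is a minimal sketch of what I mean. The sample names and ages are made up for illustration, and the isinstance check in __eq__ is just one reasonable way to write it:

```python
class Person:
    def __init__(self, firstname, lastname, age):
        self.firstname = firstname
        self.lastname = lastname
        self.age = age

    # Equality and hashing depend on firstname only.
    def __eq__(self, other):
        if not isinstance(other, Person):
            return NotImplemented
        return self.firstname == other.firstname

    def __hash__(self):
        return hash(self.firstname)


people = set()
first = Person("Ada", "Lovelace", 36)   # illustrative data
second = Person("Ada", "Byron", 20)     # same firstname, so a "duplicate"

people.add(first)
people.add(second)  # compares equal to `first`, so the set keeps `first`

print(len(people))       # 1
print(second in people)  # True
```

Duplicate detection works, but the set only tells me that an equal object is already present.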
An alternative would be to create a dictionary of Person objects, and use a separately created firstname string as the key. The drawback here is that I'd be duplicating the firstname string. This isn't really a problem in most cases, but what if I have 10,000,000 Person objects? The redundant string storage could really start adding up in terms of memory usage.
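For comparison, a rough sketch of the dictionary version. The merge rule (keep the larger age) is purely hypothetical and stands in for the real business logic. Note that in this sketch the key is person.firstname itself rather than a separately built string, in which case the dictionary holds a reference to the same str object instead of a copy, so the memory cost may be smaller than I assumed and is probably worth measuring:

```python
class Person:
    def __init__(self, firstname, lastname, age):
        self.firstname = firstname
        self.lastname = lastname
        self.age = age


by_firstname = {}
first = Person("Ada", "Lovelace", 36)   # illustrative data
second = Person("Ada", "Byron", 20)

for person in (first, second):
    # setdefault stores `person` only if the key is new, and always
    # returns the object that ends up stored under that key.
    original = by_firstname.setdefault(person.firstname, person)
    if original is not person:
        # Hypothetical merge rule for illustration: keep the larger age.
        original.age = max(original.age, person.age)

key = next(iter(by_firstname))
print(len(by_firstname))       # 1
print(key is first.firstname)  # True: the key is the same str object
```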
But if two Person objects compare equal, I need to be able to retrieve the original object so that the additional instance variables (aside from firstname) can be merged in the way the business logic requires. Which brings me back to my problem: I need some way to retrieve instances from a set.
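The gap looks like this in code. A membership test works, but the only retrieval I can come up with is a linear scan, which is O(n) per lookup and defeats the point of hashing. This is a sketch of the problem, not a proposed solution; the sample data is made up:

```python
class Person:
    def __init__(self, firstname, lastname, age):
        self.firstname = firstname
        self.lastname = lastname
        self.age = age

    def __eq__(self, other):
        if not isinstance(other, Person):
            return NotImplemented
        return self.firstname == other.firstname

    def __hash__(self):
        return hash(self.firstname)


people = {Person("Ada", "Lovelace", 36)}   # illustrative data
probe = Person("Ada", "Byron", 20)

# Fast, but it only answers yes/no and does not hand back the stored object.
print(probe in people)  # True

# Naive fallback: scan the whole set for the element equal to the probe.
stored = next((p for p in people if p == probe), None)
print(stored is probe)   # False: this is the original instance
print(stored.lastname)   # Lovelace
```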
Is there any way to do this? Or is using a dictionary the only real option here?