I defined a class named Experiment to hold the results of some lab experiments I am conducting. The idea is to create a sort of database: when I add an experiment, it is pickled to a database at exit and reloaded (and added to the class registry) at startup.

My class definition is:

class IterRegistry(type):
    def __iter__(cls):
        return iter(cls._registry)


class Experiment(metaclass=IterRegistry):
    _registry = []
    counter = 0

    def __init__(self, name, pathprotocol, protocol_struct, pathresult, wallA, wallB, wallC):
        hashdat = fn.hashfile(pathresult)
        hashpro = fn.hashfile(pathprotocol)
        chk = fn.checkhash(hashdat)
        if chk:
            raise RuntimeError("The same experiment has already been added")
        self._registry.append(self)
        self.name = name
        [...]

Here fn.checkhash is a function that checks the hashes of the files containing the results:

def checkhash(hashdat):
    for exp in cl.Experiment:
        if exp.hashdat == hashdat:
            return exp
    return False

This way, adding a previously added experiment does not overwrite the existing one.

Is it possible to return the existing instance if it already exists, instead of raising an error? (I know this is not possible inside __init__.)

David

3 Answers

Try to do it this way (very simplified example):

class A:
    registry = {}

    def __init__(self, x):
        self.x = x

    @classmethod
    def create_item(cls, x):
        try:
            return cls.registry[x]
        except KeyError:
            new_item = cls(x)
            cls.registry[x] = new_item
            return new_item


A.create_item(1)
A.create_item(2)
A.create_item(2)  # doesn't add new item, but returns already existing one
bakatrouble
  • Thanks for your answer. Is this good practice? With this solution, in fact, I would define all the attributes outside of `__init__` and I also need to instantiate the object by explicitly calling the method `create_item()` – David Jul 14 '17 at 12:19
  • I've edited the code in the answer; it now uses the `__init__()` method when initializing instances. Yes, you need to call `A.create_item()` instead of `A()`, but IMHO it's a lesser evil than a "magical" override of `__new__()` – bakatrouble Jul 14 '17 at 14:14
  • I also changed `registry` to a `dict`, which should make looking up existing values more efficient. – bakatrouble Jul 14 '17 at 14:18
  • If you need more constructor arguments, just add them to the `create_item()` method too, and then use only necessary ones (joined in `tuple` if multiple) as keys. – bakatrouble Jul 14 '17 at 14:22
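
The last comment (several constructor arguments, with only the identifying ones joined into a `tuple` key) could look roughly like the sketch below. The class `Sample` and its fields `name`, `run` and `notes` are hypothetical names chosen for illustration; they are not part of the original answer.

```python
class Sample:
    """Hypothetical class: (name, run) identifies an item, notes does not."""
    registry = {}

    def __init__(self, name, run, notes=""):
        self.name = name
        self.run = run
        self.notes = notes

    @classmethod
    def create_item(cls, name, run, notes=""):
        key = (name, run)  # only the identifying arguments form the key
        try:
            return cls.registry[key]
        except KeyError:
            # Not registered yet: build it through __init__ and remember it.
            item = cls.registry[key] = cls(name, run, notes)
            return item


a = Sample.create_item("calibration", 1, notes="first pass")
b = Sample.create_item("calibration", 1, notes="ignored, already registered")
c = Sample.create_item("calibration", 2)
print(a is b, a is c)  # True False
```

Note that arguments left out of the key (here `notes`) are silently ignored when an existing item is returned, so the key should cover everything that distinguishes two items.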

You can use __new__ if you want to customize the creation of the object instead of just initializing a newly created one:

class Experiment(metaclass=IterRegistry):
    _registry = []
    counter = 0

    def __new__(cls, name, pathprotocol, protocol_struct, pathresult, wallA, wallB, wallC):
        hashdat = fn.hashfile(pathresult)
        hashpro = fn.hashfile(pathprotocol)
        chk = fn.checkhash(hashdat)
        if chk:                      # already added, just return previous instance
            return chk
        self = object.__new__(cls)   # create a new uninitialized instance
        self._registry.append(self)  # register and initialize it
        self.name = name
        [...]
        return self                  # return the new registered instance
Serge Ballesta
  • Thanks. This seems to be the best solution, but I have a short question: I've always known that overriding `__new__` is not really good practice. From the documentation: "In general, you shouldn't need to override `__new__` unless you're subclassing an immutable type like str, int, unicode or tuple." Is it fine to initialize all attributes in `__new__`? – David Jul 14 '17 at 12:26
  • @david23: returning an existing instance when you ask for a new one is not a common use case, and it is one of the legitimate reasons to override `__new__`, another being immutable types. – Serge Ballesta Jul 14 '17 at 16:27
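
One practical reason to initialize inside `__new__` in this pattern: Python calls `__init__` on whatever `__new__` returns (when it is an instance of the class), so a separate `__init__` would run again on an instance that already exists. Below is a minimal self-contained sketch of the approach. It assumes a `dict` keyed by digest instead of the `_registry` list, and it hashes a `result_bytes` argument with `hashlib` as a stand-in for `fn.hashfile`; both are illustrative choices, not the original code.

```python
import hashlib


class Experiment:
    _registry = {}  # digest -> instance (assumption: a dict, not the original list)

    def __new__(cls, name, result_bytes):
        # Stand-in for fn.hashfile: hash the raw result data directly.
        digest = hashlib.sha256(result_bytes).hexdigest()
        existing = cls._registry.get(digest)
        if existing is not None:
            return existing          # already registered: hand back the same object
        self = super().__new__(cls)  # object.__new__ takes only the class
        self.hashdat = digest
        self.name = name
        cls._registry[digest] = self
        return self


first = Experiment("run A", b"raw data")
again = Experiment("renamed", b"raw data")       # same data, same instance
other = Experiment("run B", b"different data")
print(first is again, first.name)  # True run A
print(other is first)              # False
```

Because the class defines no `__init__`, the second call leaves the attributes of the existing instance untouched.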

Four years after the question was asked, I got here and Serge Ballesta's answer helped me. I created this example with a simpler syntax.

If base is None, it will always return the first object created.

class MyClass:
    instances = []

    def __new__(cls, base=None):
        if len(MyClass.instances) == 0:
            self = object.__new__(cls)
            MyClass.instances.append(self)

        if base is None:
            return MyClass.instances[0]
        else:
            self = object.__new__(cls)
            MyClass.instances.append(self)
            # self.__init__(base)
            return self

    def __init__(self, base=None):
        print("Received base = %s " % str(base))
        print("Number of instances = %d" % len(self.instances))
        self.base = base


R1 = MyClass("apple")
R2 = MyClass()
R3 = MyClass("banana")
R4 = MyClass()
R5 = MyClass("apple")

print(id(R1), R1.base)
print(id(R2), R2.base)
print(id(R3), R3.base)
print(id(R4), R4.base)
print(id(R5), R5.base)
print("R2 == R4 ? %s" % (R2 == R4))
print("R1 == R5 ? %s" % (R1 == R5))

This gives the output:

Received base = apple 
Number of instances = 2
Received base = None 
Number of instances = 2
Received base = banana 
Number of instances = 3
Received base = None 
Number of instances = 3
Received base = apple 
Number of instances = 4
2167043940208 apple
2167043940256 None
2167043939968 banana
2167043940256 None
2167043939872 apple
R2 == R4 ? True
R1 == R5 ? False

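It is good to know that __init__ is always called right after __new__ returns, on whatever object __new__ returned (as long as that object is an instance of the class), even if you don't call it yourself (the commented-out line) or you return an object that already exists. This is why every MyClass() call above prints again and sets base to None on the first object.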
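
The order of the calls can be checked with a small trace. This is only an illustrative sketch; the names `Tracked`, `Skipped` and `calls` are made up for the demonstration. It records each call, and also shows the one case where `__init__` is not run: when `__new__` returns something that is not an instance of the class.

```python
calls = []


class Tracked:
    _first = None

    def __new__(cls, value=None):
        calls.append("__new__")
        if cls._first is None:
            cls._first = super().__new__(cls)
        return cls._first      # always the same object

    def __init__(self, value=None):
        calls.append("__init__")
        self.value = value     # overwrites the previous value on every call


class Skipped:
    def __new__(cls):
        return 42              # not an instance of Skipped, so __init__ is skipped

    def __init__(self):
        raise AssertionError("never reached")


a = Tracked("apple")
b = Tracked("banana")
print(calls)            # ['__new__', '__init__', '__new__', '__init__']
print(a is b, a.value)  # True banana
print(Skipped())        # 42
```

So `__new__` runs first and returns, and only then is `__init__` invoked on the returned object, including when that object already existed.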

Carlos Adir