I have a ZODB installation where I have to organize several million objects of about a handful of different types. I have a generic container class Table
, which contains BTrees to index objects by attributes or combinations of these attributes. Data consistency is quite essential, and so I want to enforce, that the indices are automatically updated, when I write to any of the attributes, which are covered by the indexing. So a simple obj.a = x
should be sufficient to calculate all new dependent index entries, check if there are any collisions, and finally write the indices and the value.
In general, I'd be happy to use a library for that, so I was looking at repoze.catalog and IndexedCatalog, but was not really happy with that. IndexedCatalog seems dead for quite a while, and not providing the kind of consistency for changes to the objects. repoze.catalog seems to be more used and active, but also not providing this kind of consistency, as far as I understand. If I missed something here, I'd love to hear about it and prefer reusing over reinventing.
So, how I see it besides trying to find a library for the problem, I'd have to intercept the write access to the dataobject attributes with descriptors and let the Table
class do the magic of changing the indices. For that, the descriptor instances have to know, with which Table
instances they have to talk with. The current implementation goes someting like that:
class DatabaseElement(Persistent):
name = Property(constant_parameters)
...
class Property(object):
...
def __set__(self, obj, name, val):
val = self.check_value(val)
setattr(obj, '_' + name, val)
When these DatabaseElement
classes are generated, the database and the objects within are not yet created. So as mentioned in this nice answer, I'd probably have to create some singleton lookup mechanism, to find the Table
objects, without handing them to Property
as an instantiation argument. Is there a more elegant way? Persisting the descriptors itself? Any suggestions and best-practice examples welcome!