I would like to re-implement some of my existing SQLAlchemy models in an append-only datastore; append-only meaning that objects are only updated with INSERT statements, not with UPDATE or DELETE statements.
The UPDATE and DELETE statements would be replaced with another INSERT that increments the version. There would be an is_deleted flag and instead of DELETE, a new version with is_deleted=True would be created:
id | version | is_deleted | name | description ...
---+---------+------------+------+-----------------
 1 |       1 | F          | Fo   | Text text text.
 1 |       2 | F          | Foo  | Text text text.
 2 |       1 | F          | Bar  | null
 1 |       3 | T          | Foo  | Text text text.
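To make the scheme concrete, here is a minimal sketch that reproduces the rows above. It uses the standard-library sqlite3 module rather than SQLAlchemy. The table name `item`, the composite primary key `(id, version)`, and the helper `append_version` are assumptions made for illustration, not part of my actual models.

```python
import sqlite3


def connect():
    """Create an in-memory database with a hypothetical append-only table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE item (
            id          INTEGER NOT NULL,
            version     INTEGER NOT NULL,
            is_deleted  INTEGER NOT NULL DEFAULT 0,
            name        TEXT,
            description TEXT,
            PRIMARY KEY (id, version)  -- assumption: one row per (id, version)
        )
        """
    )
    return conn


def append_version(conn, item_id, *, is_deleted=False, **changes):
    """INSERT a new version of item_id, copying unchanged columns forward."""
    row = conn.execute(
        "SELECT version, name, description FROM item "
        "WHERE id = ? ORDER BY version DESC LIMIT 1",
        (item_id,),
    ).fetchone()
    # A missing row means this is the first version of the object.
    version, name, description = row if row else (0, None, None)
    conn.execute(
        "INSERT INTO item (id, version, is_deleted, name, description) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            item_id,
            version + 1,
            int(is_deleted),
            changes.get("name", name),
            changes.get("description", description),
        ),
    )
    return version + 1


conn = connect()
append_version(conn, 1, name="Fo", description="Text text text.")  # create
append_version(conn, 1, name="Foo")                                # "UPDATE"
append_version(conn, 2, name="Bar")                                # create
append_version(conn, 1, is_deleted=True)                           # "DELETE"

rows = conn.execute(
    "SELECT id, version, is_deleted, name, description FROM item ORDER BY rowid"
).fetchall()
```

Neither the "update" nor the "delete" touches an existing row; both read the latest version and insert its successor.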
Additionally,
- All SELECT statements will need to be rewritten to fetch only the row with the maximum version number for each id, as described in this question: PostgreSQL - fetch the row which has the Max value for a column
- All (unique) indexes need to be rewritten to be unique by the "id" primary key, as each id may be present more than once.
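The SELECT rewrite could look roughly like the sketch below, again in plain sqlite3 with a hypothetical `item` table holding the example rows. It joins against a per-id `MAX(version)` subquery; this is one of several possible formulations, and I am assuming the `is_deleted` filter has to be applied after the latest version is chosen, so that a deleted object does not fall back to an older, undeleted version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item ("
    "id INTEGER, version INTEGER, is_deleted INTEGER, "
    "name TEXT, description TEXT, PRIMARY KEY (id, version))"
)
conn.executemany(
    "INSERT INTO item VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 0, "Fo", "Text text text."),
        (1, 2, 0, "Foo", "Text text text."),
        (2, 1, 0, "Bar", None),
        (1, 3, 1, "Foo", "Text text text."),
    ],
)

# Pick the newest version of every id first, then drop ids whose newest
# version is flagged as deleted.
LATEST_LIVE_ROWS = """
SELECT i.id, i.version, i.name, i.description
FROM item AS i
JOIN (
    SELECT id, MAX(version) AS version
    FROM item
    GROUP BY id
) AS latest
  ON latest.id = i.id AND latest.version = i.version
WHERE i.is_deleted = 0
ORDER BY i.id
"""

live_rows = conn.execute(LATEST_LIVE_ROWS).fetchall()
```

With the example data, id 1 is excluded because its latest version (3) is flagged as deleted, and only the single version of id 2 remains.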
I know how to solve most of these issues, but I am struggling with the event hooks in SQLAlchemy that would handle certain things that need to be done on update & delete.
The SQLAlchemy documentation already has some basic examples for versioning. The versioned rows example comes close to what I want, but it does not handle (1) deletion and (2) foreign key relationships.
(1) Deletion. I know there is a session.deleted collection, and I would iterate over it in a similar way to how session.dirty is iterated over in the versioned_rows.py example. But how would I unflag the item from the to-be-deleted list and create a new item instead?
(2) The above-mentioned example only deals with a parent-child relationship, and the way it does so (expiring the relationship) seems to require custom code for each model. (2.1) Is there a way to make this more flexible? (2.2) Is it possible to configure SQLAlchemy's relationship() to return the object with max(version) for a given foreign key?