I have some unstructured key-value pairs that I eventually want to store in a PostgreSQL database using the Django framework. It is mostly a debugging dump, so I don't want to query it in SQL; I just want to retrieve the whole thing and deal with it on the Python side.
For example:
from datetime import datetime
from decimal import Decimal

data = {
    "name": "bob",
    "size": Decimal("173.4"),
    "logged_in_at": datetime.now(),
}
However, I cannot just create a Model and a table for it, because the data is unstructured and can have any field names. (I can ensure, though, that the names adhere to both Python and SQL syntax rules.)
Applying the entity-attribute-value model would be overkill because I don't want to query single fields of my data, just retrieve the whole thing.
I cannot use JSONField, or JSON in general, because plain JSON has no datetime or Decimal types, so those values don't survive a round trip reliably.
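To illustrate the failure mode (a minimal sketch; the stdlib encoder refuses these types, and Django's DjangoJSONEncoder accepts them but flattens both to plain strings):

import json
from datetime import datetime
from decimal import Decimal

from django.core.serializers.json import DjangoJSONEncoder

data = {"size": Decimal("173.4"), "logged_in_at": datetime.now()}

# The stdlib encoder rejects both types outright:
# json.dumps(data)  # TypeError: Object of type Decimal is not JSON serializable

# Django's encoder accepts them, but the type information is gone:
decoded = json.loads(json.dumps(data, cls=DjangoJSONEncoder))
print(type(decoded["size"]))  # <class 'str'>, not Decimal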
I could use a BinaryField to store pickled data, but I dread the security implications.
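For completeness, a sketch of what that would look like (the security problem being that unpickling tampered bytes can execute arbitrary code):

import pickle
from datetime import datetime
from decimal import Decimal

data = {"name": "bob", "size": Decimal("173.4"), "logged_in_at": datetime.now()}

blob = pickle.dumps(data)      # raw bytes, suitable for a BinaryField
restored = pickle.loads(blob)  # round-trips Decimal and datetime exactly,
                               # but loads() on untrusted bytes can run arbitrary code
assert restored == data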
I could dynamically create serializers for Django REST framework and the related models (something like the sketch below), but I dread the necessary abstract meta voodoo and how to deal with changes in the underlying data models at runtime.
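Roughly this kind of construction, as a sketch of what I mean (make_serializer is a hypothetical helper I made up, the field mapping is incomplete, and it doesn't address the dynamic-model side at all):

from datetime import datetime
from decimal import Decimal

from rest_framework import serializers

def make_serializer(sample):
    # Derive serializer fields from a sample dict at runtime.
    fields = {}
    for key, value in sample.items():
        if isinstance(value, Decimal):
            fields[key] = serializers.DecimalField(max_digits=20, decimal_places=10)
        elif isinstance(value, datetime):
            fields[key] = serializers.DateTimeField()
        else:
            fields[key] = serializers.CharField()
    return type("DynamicSerializer", (serializers.Serializer,), fields)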
I took a quick look at MessagePack; it seems to have the same issues as JSON.
Protocol Buffers at least have a Timestamp type, but no decimal type. Applying ye olde "represent decimals as integers" trick would lose type information and make integers indistinguishable from decimals.
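To spell out the problem with that trick (plain Python; the fixed scale factor of 100 is just an example):

from decimal import Decimal

scaled = int(Decimal("173.4") * 100)  # 17340, with an implied scale of 2
count = 17340                         # a genuine integer value
# Both arrive as the same integer on the wire, so the decoder cannot tell
# which fields must be scaled back down and turned into Decimals.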
Because this is only a single field, introducing an additional NoSQL database feels like overkill. Also, at least MongoDB doesn't seem to have a native decimal datatype.
Currently I'm leaning towards writing a JSON encoder and decoder that converts all the fields to str and prefixes them with type information, for example:
encoded_data = {
    "name": "string:bob",
    "size": "decimal:173.4",
    "logged_in_at": "datetime:2022-01-04T00:54:14.245607"  # ISO format
}
Then I could store this in a JSONField or even a CharField.
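Roughly, the encoder/decoder pair I have in mind would look like this (a sketch; the helper names encode_value/decode_value are mine, and it only handles the three types above):

import json
from datetime import datetime
from decimal import Decimal

def encode_value(value):
    # Tag each value with its type so decoding can restore it.
    if isinstance(value, Decimal):
        return f"decimal:{value}"
    if isinstance(value, datetime):
        return f"datetime:{value.isoformat()}"
    return f"string:{value}"

def decode_value(tagged):
    # partition() splits on the first colon only, so ISO timestamps survive.
    kind, _, raw = tagged.partition(":")
    if kind == "decimal":
        return Decimal(raw)
    if kind == "datetime":
        return datetime.fromisoformat(raw)
    return raw

def dumps(data):
    return json.dumps({key: encode_value(value) for key, value in data.items()})

def loads(text):
    return {key: decode_value(value) for key, value in json.loads(text).items()}

With this, loads(dumps(data)) == data holds for the example dict at the top.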
However, this seems like a lot of code, steps, and CPU for such a seemingly simple task.
Are there any better solutions?