0

I have some unstructured key value pairs I eventually want to store in a PostgreSQL database using the Django framework. It is mostly a debugging dump, so I don't want to query it in SQL but retrieve the whole thing and then deal with it on the Python side.

For example:

data = {
   "name": "bob", 
   "size": Decimal("173.4"), 
   "logged_in_at": datetime.now()
}

However, I can not just create a Model and a table for it because the data are unstructured and can have any field name. (I can ensure though that the name adheres to both Python and SQL syntax rules.)

Applying entity-attribute-value model would be overkill because I don't want to query single fields of my data, just retrieve the whole thing.

I cannot use JSONField or JSON in general because it cannot process datetime and Decimal reliably.

I'm could use a BinaryField to store pickle data but dread the security implications.

I could dynamically create serializers for Django REST framework and the related models but dread the necessary abstract meta voodoo and how to deal with changes in the underlying data models during runtime.

I took a quick look at MessagePack, it seems to have the same issues as JSON.

Protocol Buffers at least have a Timestamp type but do not have a decimal type. Applying ye olde "represent decimals as integers" trick would lose type information and make integer indistinguishable from decimals.

Because this is only a single field, introducing an additional NoSQL database feels like overkill. Also, at least MongoDB doesn't seem to have a native decimal datatype.

Currently I'm leaning towards writing a JSON encoder and decoder that converts all the fields to str and prefix it with type information, for example:

encoded_data = {
   "name": "string:bob", 
   "size": "decimal:173.4", 
   "logged_in_at": "datetime:2022-01-04T00:54:14.245607"  # ISO format
}

Then I could store this in a JSONField or even CharField.

However, this seems to be a lot of code, steps and CPU for such a seemingly simple task.

Are their any better solutions?

roskakori
  • 3,139
  • 1
  • 30
  • 29
  • Does this answer your question? [arbitrary typed data in django model](https://stackoverflow.com/q/2637438/90527) – outis Jan 04 '22 at 02:30
  • ... "[Django: storing/querying a dictionary-like data set?](https://stackoverflow.com/q/24688388/90527)" – outis Jan 04 '22 at 02:36
  • @outis Thanks, but not really. Entity attribute model would be overkill as I don't want to query singular fields in the data, just retrieve the whole thing. (It's mostly a structured debugging dump.) – roskakori Jan 04 '22 at 12:03
  • The question seems to be asking for a serialization format & library, one that supports some partially specified (possibly extensible, possibly limited) set of types and is safe enough with untrusted data. Such recommendations are beyond the scope of SO. – outis Jan 05 '22 at 08:38

0 Answers0