I have a system producing about 5TB of time-tagged numeric data every year. The fields tend to be different for each row, and to avoid having heaps of NULLs I'm thinking of using Postgres as a document store with JSONB.
However, GIN indexes on JSONB fields don't seem to be made for numerical and datetime data. There are no inequality or range operators for numbers and dates.
Here they suggest making special constructs with LATERAL
to treat JSON values as normal numeric columns, and here someone proposes using a "sortable" string format for dates and filter string ranges.
These solutions sound a bit hacky and I wonder about their performance. Perhaps this is not a good application for JSONB?
An alternative I can think of using a relational DB is to use the 6th normal form, making one table for each (optional) field, of which however there would be hundreds. It sounds like a big JOIN mess, and new tables would have to be created on the fly any time a new field pops up. But maybe it's still better than a super-slow JSONB implementation.
Any guidance would be much appreciated.
More about the data
The data are mostly sensor readings, physical quantities and boolean flags. Which subset of these is present in each row is unpredictable. The index is an integer, and the only field that always exists is the corresponding date.
There would probably be one write for each value and almost no updates. Reads can be frequent and sliced based on any of the fields (some are more likely to be in a WHERE statement than others).