It very much depends on the use case and requirements.
1.
24 individual float
columns, possibly nullable (your first option) are the optimum for storage size (and hence speed). 8 bytes per float
(defaults to float8
a.k.a. double precision
) or 4 bytes per real
(float4
) and no overhead. Since there are exactly 24 given columns, you'll never run into design problems, like you would with a variable number.
A legal identifier in SQL can't start with a number. You would have to always double-quote. Error-prone and confusing. Use a character prefix, like h0
.. h23
.
2.
A one-to-many table with a FK constraint to id
ot the main table. You would make that (id int, hour_of_the_day int, value float8)
, not (id, value, hour_of_the_day)
, no space wasted for alignment padding. Add a CHECK
constraint to only allow values 0 - 23 for hour_of_the_day
.
This multiplies storage size as compared to 1.: each separate row occupies around 44 bytes. 28 bytes of overhead per row + 16 bytes for the data. You at least add a PK constraint on (id, hour_of_the_day)
which is implemented using a btree index (similar page and row structure as tables). This adds another 36 bytes per row. So we end up with at least 80 bytes per value (plus overhead per data page and table). Roughly 10x the storage size of 1..
3.
Arrays? Don't. Storage is more expensive than 1. (array overhead of 24 bytes), handling is least convenient, enforcing integrity (max. / exact length) is not as simple.
Some more pros and cons
With 2.
you need a join of two tables to read data, which is more expensive. But there are various advantages, too. If you often need to read or write the main table without columns h0
- h23
, you get that a bit cheaper. Updating individual values may be cheaper, since the main table remains unchanged. OTOH, inserting or updating all 24 values of the same id
at once is typically more expensive (1 new row version vs. 24 new row versions). Deleting individual values is cheaper for 2..
The values will mostly be queried together ...
That's a lot cheaper with 1..
... and scaled
That may be simpler with 2.: applying the same operation to one column vs. 24 different columns.
Etc. As I said, it very much depends on the use case and requirements.
Related: