2

I'm new to TimescaleDB and relational databases in general.

I have data in the form

Timestamp Sensor name 1 Sensor name2 ... Sensor name 300
yyyy-mm-dd 00:01 22.5 25 ... sensor 300 value
yyyy-mm-dd 00:02 ...

but since I want to store metadata about each sensor (location, type etc) in a second table, I have stored this in TimescaleDB as

timestamp sensor_id value
yyyy-mm-dd 00:01 1 22.5
yyyy-mm-dd 00:01 2 25

This is fine, although I don't like having the timestamp duplicated 300 times.

But, I need a query that will return data in the original form, one row per unique timestamp, sensor names (from the second table) as columns. I feel like this must be a very normal operation with this kind of data, but I can't for the life of me find out how to do it.

So two questions: is this actually a good way to store data like this in TimescaleDB. From what I can see in the documentation, the raw form of the data is 'wide table' and I could just store it like that, but then I don't know how to link metadata to individual columns. The way I'm storing it now is a 'narrow table', but then I need a way to reconstruct the 'wide' form. Is there a preferred approach in Timescale?

How would I construct a query to get the data in the original 'wide' form? It's essentially a pivot, so would I use crosstab? Again, this seems like it shouldn't be a strange thing to do, is there a common way that timescale users approach this?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • If you're going to group your data by some timeframe using the `time_bucket` function, you can reuse it to join. The question is: if you always need the data aggregated, maybe it is worth saving them in the same table. – jonatasdp Oct 13 '21 at 11:11

1 Answers1

1

The wide representation can be reconstructed with help of crosstab function from tablefunc module, see this reply. Another approach is to use as many self-joins as columns, i.e., about 300 in the described case, which is likely be horrible for performance. The first solution might also hit performance issues. And both approaches might not allow to use some TimescaleDB's functionality.

Since the data originate from the wide form, I suggest to store data in this way and preserve the original semantics, that sensor values come together with the same timestamp. This will also remove the need to reconstruct this representation.

In general, the choice between the representations highly depends on how data will be queried. The question presents only one such requirement. Also it is good to consider that it might be worth to preserve the original data semantics.

k_rus
  • 2,959
  • 1
  • 19
  • 31