I have the following Query for my timescaleDB that I want to optimize
SELECT
sensor_reading.day as day,
sensor_delta_daily,
pos1_daily,
(sensor_delta_daily * 1 * calorific_value_daily) AS calcval1_daily
FROM (
WITH sensor_counter AS (
SELECT
time,
(LEAD(sensor_delta) OVER (ORDER BY time)/DATE_PART('day', LEAD(time) OVER (ORDER BY time) - time)) as sensor_delta_daily
FROM sensor_counter
WHERE sensor_delta is not null
ORDER BY time
)
SELECT
time_bucket_gapfill('1 day', time) AS day,
locf(avg(reading_delta_daily)) AS reading_delta_daily
FROM energy_counter
WHERE time between (now() - INTERVAL '2 months') and now()
GROUP BY day
) AS counter_reading,
(SELECT
time_bucket_gapfill('1 day', time) AS day,
locf(avg(calorific_value)) AS calorific_value_daily
FROM energy_counter_daily_nullable
WHERE time between (now() - INTERVAL '2 months') AND now()
AND calorific_value is not null
GROUP BY day
) AS counter_calorific
WHERE counter_reading.day = counter_calorific.day
;
MY QUESTION IS:
- Since I need way more positions in the future and this Query will grow bigger and bigger, is it possible to optimize it by using contiuous aggregates and materialized views?
- If yes: how would this look like?
- If no: are there other, maybe better options?
EXPLANATION FOR A MORE DETAILED UNDERSTANDING:
I'm a newbie to databases and especially timescaleDB and have a little sensor project with the following scenario:
I have a sensor which returns a consumption value. Due to some technical limitations of the sensor (big values leading to overflows where the sensor restarts at the nullpoint), I simply store the difference between the current and last capture of the sensor (sensor_delta). This value is only valid the day I inserted it. But I'm able to fill the days between two consumption values with valid values by dividing the newest value by the number of days to the consumption value before.
There are also other timebased datas connected to the sensor but captured and stored by myself (pos0_value, pos1_price, pos2_price ). These are values that change on different times due to different events and they're valid from the time I inserted the value to the time I inserted a new one. All these data are important for calculations I need.
So, this was a lot of jibberish talk. For a better understanging, here's the plain data I have:
The actual data:
time | sensor_uuid | sensor_delta | pos0_value | pos1_value | pos2_value |
---|---|---|---|---|---|
2021-06-20 00:00:00 | f5778d7c-46a4-3d6b-8b40-133569cd2f01 | 514.91 | 9.85 | 0.3462 | 19.8263 |
2021-06-23 00:00:00 | f5778d7c-46a4-3d6b-8b40-133569cd2f01 | NULL | NULL | 0.2228 | NULL |
2021-06-27 00:00:00 | f5778d7c-46a4-3d6b-8b40-133569cd2f01 | NULL | NULL | NULL | 18.8928 |
2021-06-30 00:00:00 | f5778d7c-46a4-3d6b-8b40-133569cd2f01 | 560.28 | NULL | NULL | NULL |
What I want to do with these data is the following:
- calculate the daily sensor_delta from the newest to the previous entry (sensor_deltanew)
- get the valid value for the queried day or time-range
- calculate new data based on the recorded ones for each day
The dates I need:
- calcval1_daily = sensor_delta_daily*pos1_daily+pos0_daily
- calcval2_daily = sensor_delta_daily+pos2_daily+calcval1_daily
day | sensor_delta_daily | pos0_daily | pos1_daily | pos2_daily | calcval1_daily | calcval2_daily |
---|---|---|---|---|---|---|
2021-06-20 | 514.91 | 9.85 | 0.3462 | 19.8263 | 188,111842 | 722,848142 |
2021-06-21 | 56.028 | 9.85 | 0.3462 | 19.8263 | 29,2468936 | 105.1011936 |
2021-06-22 | 56.028 | 9.85 | 0.3462 | 19.8263 | 29,2468936 | 105.1011936 |
2021-06-23 | 56.028 | 9.85 | 0.2228 | 19.8263 | 22,3330384 | 98,1873384 |
2021-06-24 | 56.028 | 9.85 | 0.2228 | 19.8263 | 22,3330384 | 98,1873384 |
2021-06-25 | 56.028 | 9.85 | 0.2228 | 19.8263 | 22,3330384 | 98,1873384 |
2021-06-26 | 56.028 | 9.85 | 0.2228 | 19.8263 | 22,3330384 | 98,1873384 |
2021-06-27 | 56.028 | 9.85 | 0.2228 | 18.8928 | 22,3330384 | 97,2538384 |
2021-06-28 | 56.028 | 9.85 | 0.2228 | 18.8928 | 22,3330384 | 97,2538384 |
2021-06-29 | 56.028 | 9.85 | 0.2228 | 18.8928 | 22,3330384 | 97,2538384 |
2021-06-30 | 56.028 | 9.85 | 0.2228 | 18.8928 | 22,3330384 | 97,2538384 |
2021-07-01 | NULL | 9.85 | 0.2228 | 18.8928 | NULL | NULL |
2021-07-02 | NULL | 9.85 | 0.2228 | 18.8928 | NULL | NULL |
2021-07-03 | NULL | 9.85 | 0.2228 | 18.8928 | NULL | NULL |
As soon as the next sensor_delta has been inserted, the sensor_delta_daily should be updated until 2021-07-01.
I know, this was a lot. I also got my hands on this scenario and the only solution I came up with so far is the Query mentioned above which kinda returns the dataset I want (I left out the pos2_value and calcval2_daily). But the more datapositions I have, the more the Query will stack up and I have a feeling the Query get's reeeally big, unreadable and slow.
Do you have any ideas how to solve this scenario?