I have a Postgresql table which has time series data. I want to aggregate values over configurable periods and durations e.g. aggregate tag-n over 30-minute intervals for one day. The output should look like -
Period | Max | Min | Avg |
---|---|---|---|
00:00 | xx | xx | xx |
00:30 | xx | xx | xx |
01:00 | xx | xx | xx |
... | ... | ... | ... |
23:30 | xx | xx | xx |
To do this, I created a function to generate the custom intervals and then used this function in the query -
create or replace function interval_generator(start_ts timestamp, end_ts timestamp, round_interval INTERVAL)
returns TABLE(row_seq bigint,start_time timestamp,end_time timestamp) as $$
BEGIN
return query
SELECT row_number() over () row_seq,(n) start_time,(n + round_interval) end_time
FROM generate_series(date_trunc('minute', start_ts), end_ts, round_interval) n;
END $$
LANGUAGE 'plpgsql';
with intervals as (select * from interval_generator('2022-08-25 00:00:00','2022-08-26 23:59:59','30 Minute'::INTERVAL))
select intervals.row_seq,intervals.start_time,max("Value"),min("Value"),avg("Value")
from public.<TableName> inner JOIN intervals on "TimeStamp" >= intervals.start_time and "TimeStamp" < intervals.end_time
WHERE "Tag"=xxxxxxxxxx
GROUP BY intervals.row_seq,intervals.start_time
ORDER BY intervals.row_seq asc
I needed an additional column (first one) for the rest of the application logic. Is this the best way to get the desired output in a single query? Or is splitting the query into multiple ones, one query for each interval, better?
I am in no way an expert in SQL or Postgresql, I put together the above queries using my Google skills. It works, but I am not sure if this is optimized.