I'm looking forward to optimize a query i have. The tablese are:
static_data: historical configuration data of items
id(serial pk) | timestamp(default now() indexed) | tag(varchar 255 fk from tag indexed) | prop(varchar 32) | value(text) |
---|---|---|---|---|
1 | 2022-03-11 03:45:01 | AA-12 | max_value | 100 |
2 | 2022-03-11 03:45:01 | AA-12 | alarm_high | 90 |
3 | 2022-03-11 03:45:01 | AA-12 | alarm_low | 5 |
4 | 2022-03-11 03:45:01 | AA-12 | min_value | -30 |
5 | 2022-03-11 03:45:01 | AA-12 | card_type | 200HL-A |
6 | 2022-03-11 03:45:01 | BB-14 | max_value | 5.5 |
7 | 2022-03-11 03:45:01 | BB-14 | min_value | 1.1 |
8 | 2023-01-12 12:22:56 | AA-12 | alarm_low | <null> |
9 | 2023-01-12 12:22:56 | AA-12 | card_type | 300H-BS |
10 | 2023-01-12 12:22:56 | AA-12 | alarm_high | 80 |
tag: tag list (there are other columns that are not related to this query)
tag(varchar 255 pk) |
---|
AA-12 |
BB-14 |
The purpose of static_data is to provide historical data about items configuration. In the example above the item AA-12 was changed on 2022-03-11 03:45:01, max_value, min_value, alarm_high and alarm_low were set. The next year it was changed again, alarm_high was changed and alarm_low was removed because the sensor hw was updated and the new hw does not have alarm_low.
To make use of this data i need a query to retrieve the configuration of all items at any time.
Currently I'm using this query:
SELECT sd.tag,
jsonb_object_agg(sd.prop, sd.value) AS data
FROM ( SELECT DISTINCT ON (static_data.tag, static_data.prop) static_data.id,
static_data."timestamp",
static_data.tag,
static_data.prop,
static_data.value
FROM hierarchy.static_data
WHERE static_data."timestamp" < now()
ORDER BY static_data.tag, static_data.prop, static_data."timestamp" DESC) sd
WHERE sd.value IS NOT NULL
GROUP BY sd.tag
Which results in:
tag | data |
---|---|
AA-12 | {"max_value":"100", "min_value":"-30", "alarm_high": "80", "card_type":"300H-BS"} |
BB-14 | {"max_value":"5.5", "min_value":"1.1"} |
The result is as exactly what i need, for every tag a jsonb which has for each property the most up-to-date value before <timestamp>
without the nulls, the problem being the query is too slow, it takes 200-220ms to run on my test data-sample 1224tags with 6 props each.
Tag count in production should be 6k but in future projects can be as high as 10-15k, each item will have about 30-40 props depending on item type, so the static_data table scales pretty fast with time.
I cannot save the entire jsonb with the tag config every time it is changed because i need the prop that is changed or removed in reports about config change and in dashboard for every tag.prop to display the historical value.
This issue is similar to mine and accepted answer in section 2 addresses my issue partially, but my result should be grouped by tag and prop first, then by tag to aggregate the config in a single jsonb.
EDIT:
EXPLAIN(ANALYZE, VERBOSE, BUFFERS):
GroupAggregate (cost=1013.39..1116.72 rows=200 width=38) (actual time=14.228..20.671 rows=1224 loops=1)
Output: sd.tag, jsonb_object_agg(sd.prop, sd.value)
Group Key: sd.tag
Buffers: shared hit=110
-> Subquery Scan on sd (cost=1013.39..1108.13 rows=1218 width=28) (actual time=14.217..16.450 rows=6811 loops=1)
Output: sd.id, sd."timestamp", sd.tag, sd.prop, sd.value
Filter: (sd.value IS NOT NULL)
Rows Removed by Filter: 1
Buffers: shared hit=110
-> Unique (cost=1013.39..1095.89 rows=1224 width=44) (actual time=14.214..15.867 rows=6812 loops=1)
Output: NULL::bigint, static_data."timestamp", static_data.tag, static_data.prop, static_data.value
Buffers: shared hit=110
-> Sort (cost=1013.39..1040.89 rows=11000 width=44) (actual time=14.214..14.609 rows=11000 loops=1)
Output: NULL::bigint, static_data."timestamp", static_data.tag, static_data.prop, static_data.value
Sort Key: static_data.tag, static_data.prop, static_data."timestamp" DESC
Sort Method: quicksort Memory: 1244kB
Buffers: shared hit=110
-> Seq Scan on hierarchy.static_data (cost=0.00..275.00 rows=11000 width=44) (actual time=0.007..1.827 rows=11000 loops=1)
Output: NULL::bigint, static_data."timestamp", static_data.tag, static_data.prop, static_data.value
Filter: (static_data."timestamp" < now())
Buffers: shared hit=110
Planning:
Buffers: shared hit=3
Planning Time: 0.138 ms
Execution Time: 20.725 ms
static_data create SQL:
https://pastebin.com/raw/E9u2j6ra
tag create SQL: