What could be the problem with a Continuous Aggregate from multiple TimescaleDB Hypertables?

Question

I've got a couple time series data tables in my Postgres database, which I recently converted to TimescaleDB hypertables. I've got a massive materialized view, but refreshing that takes ages. For that reason I wanted to use TimescaleDB Continuous Aggregates. Unfortunately that gives an error saying:

ERROR: only 1 hypertable is permitted in SELECT query for continuous aggregate

So I was planning on building my own continuous aggregate by simply creating a new table and calculate/append the new records every 10 min or so. A colleague suggested that maybe the timescale guys had good reasons for not permitting a continuous aggregate from multiple hypertables, but I just don't see what it could be.

I created an issue about it on their github page, but for now I haven't gotten any response. So before I start building my own Continuous Aggregate, I thought I'd ask you smart people from stackoverflow.

What could be a reason for the people of TimescaleDB to not allow Continuous Aggregates from multiple hypertables? What challenges would you see in building something like that yourself?

score 3 · Answer 1 · answered Jun 14 '21 at 12:43

3

The idea of continuous aggregates is to refresh only current window on hot data and don't touch old data. However, it is not always possible to calculate a current increment to continuous aggregate materialisation in the case of a view query with joins. If the join will require to check old data, it will kill performance.

Saying this, there are certain use cases where it should be possible to support limited join queries, where the limitations might affect, which tables can be joined and what should be join conditions and filters.

I suggest to describe your case in an issue to TimescaleDB, so it can be considered when the feature will be designed.

answered Jun 14 '21 at 12:43

k_rus

2,959
1
19
31

I was thinking along the same lines; if you join two tables on a field which requires all data to be joined you indeed loose all the performance benefits of TimescaleDB. However, if you select/join both tables with an equal time-frame then you can still get the performance benefits. – kramer65 Jun 16 '21 at 11:23
@kramer65 Right, if the join limited by time frame, e.g., within time bucket, it is expected to work efficiecntly. Implementing in TimescaleDB will, obv, require to design and implement the support with the limitation, so it is usable for users and works correctly for all possible databases and view queries. So it is question of priority for Timescale team, thus I suggested to put more effort in TimescaleDB issue tracker. – k_rus Jun 16 '21 at 13:31
I'll add a feature request. I'm assuming more people would benefit from this. Thanks for your answers! – kramer65 Jun 16 '21 at 13:47

What could be the problem with a Continuous Aggregate from multiple TimescaleDB Hypertables?

1 Answers1