Suppose I have a given time range. For explanation, let's consider something simple, like whole year 2018. I want to query data from ClickHouse as a sum aggregation for each quarter so the result should be 4 rows.
The problem is that I have data for only two quarters so when using GROUP BY quarter
, only two rows are returned.
SELECT
toStartOfQuarter(created_at) AS time,
sum(metric) metric
FROM mytable
WHERE
created_at >= toDate(1514761200) AND created_at >= toDateTime(1514761200)
AND
created_at <= toDate(1546210800) AND created_at <= toDateTime(1546210800)
GROUP BY time
ORDER BY time
1514761200
– 2018-01-01
1546210800
– 2018-12-31
This returns:
time metric
2018-01-01 345
2018-04-01 123
And I need:
time metric
2018-01-01 345
2018-04-01 123
2018-07-01 0
2018-10-01 0
This is simplified example but in real use case the aggregation would be eg. 5 minutes instead of quarters and GROUP BY would have at least one more attribute like GROUP BY attribute1, time
so desired result is
time metric attribute1
2018-01-01 345 1
2018-01-01 345 2
2018-04-01 123 1
2018-04-01 123 2
2018-07-01 0 1
2018-07-01 0 2
2018-10-01 0 1
2018-10-01 0 2
Is there a way to somehow fill the whole given interval? Like InfluxDB has fill
argument for group or TimescaleDb's time_bucket()
function with generate_series()
I tried to search ClickHouse documentation and github issues and it seems this is not implemented yet so the question perhaps is whether there's any workaround.