I have following Postgres query that runs over a large table and whenever it does, I observe a high CPU load that reaches 100% :
SELECT id, user_id, type
FROM table ss
WHERE end_timestamp > CURRENT_TIMESTAMP
AND (notification_channel1 = true OR notification_channel2 = true)
AND last_notification_timestamp < CURRENT_TIMESTAMP - interval '1 days'
ORDER BY user_id, type, title
Table schema:
CREATE TABLE saved_search (
id BIGINT PRIMARY KEY,
user_id varchar(40) NOT NULL,
title varchar(500) NOT NULL,
start_timestamp timestamp with time zone NOT NULL,
end_timestamp timestamp with time zone NOT NULL,
last_notification_timestamp timestamp with time zone NOT NULL,
notification_channel1 boolean NOT NULL DEFAULT TRUE,
notification_channel2 boolean NOT NULL DEFAULT TRUE,
);
I am thinking of using a covering index to speed up this query and hopefully avoid the cpu spike.
First question: would that be a valid path to follow?
The index I'm thinking of would be something like:
CREATE INDEX IX_COVERING
ON table (end_date, notification_channel1, notification_channel2, last_notification_timestamp)
INCLUDE (id, user_id, type)
Would that be useful ? Would the INCLUDE be needed actually? Should I change the order of the columns in the index? Are there other / better approaches?
Gather Merge (cost=387647.26..737625.73 rows=2999606 width=40) (actual time=36085.232..47805.119 rows=3634052 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=386647.24..390396.74 rows=1499803 width=40) (actual time=35820.825..40497.683 rows=1211351 loops=3)
" Sort Key: user_id, type, title"
Sort Method: external merge Disk: 63640kB
Worker 0: Sort Method: external merge Disk: 63944kB
Worker 1: Sort Method: external merge Disk: 57896kB
-> Parallel Seq Scan on table (cost=0.00..150768.88 rows=1499803 width=40) (actual time=0.200..4176.269 rows=1211351 loops=3)
Filter: ((notification_channel1 OR notification_channel2) AND (end_timestamp > CURRENT_TIMESTAMP) AND (last_notification_timestamp < CURRENT_TIMESTAMP - interval '1 days'))
Rows Removed by Filter: 136960
Planning Time: 3.632 ms
Execution Time: 48292.788 ms
With SET work_mem = '100MB';
, here is the output I'm getting:
`Gather Merge (cost=305621.26..655599.73 rows=2999606 width=40) (actual time=48856.376..55606.264 rows=3634097 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=304621.24..308370.74 rows=1499803 width=40) (actual time=46436.173..47563.732 rows=1211366 loops=3)
" Sort Key: user_id, type, title"
Sort Method: external merge Disk: 72232kB
Worker 0: Sort Method: external merge Disk: 55288kB
Worker 1: Sort Method: external merge Disk: 57816kB
-> Parallel Seq Scan on table (cost=0.00..150768.88 rows=1499803 width=40) (actual time=0.911..4643.228 rows=1211366 loops=3)
Filter: ((notification_channel1 OR notification_channel2) AND (end_timestamp > CURRENT_TIMESTAMP) AND ((ast_notification_timestamp < CURRENT_TIMESTAMP - interval '1 days'))
Rows Removed by Filter: 136960
Planning Time: 0.450 ms
Execution Time: 56035.995 ms