A materialized view is not going to help, since your query needs to be based on the latest data and you would have to refresh the MV just as often as you query.
Trigger-based solutions would be another option: keep an auxiliary table up to date with current counts per page. But I would expect the additional cost on (your many) write operations to be much higher than the gain for the read operation. So I would rule this out, too.
While you operate with one big table I suggest a partial index:
CREATE INDEX foo ON user_is_viewing_page (page_id, timestamp)
WHERE timestamp > '2014-12-29 23:30:00'::timestamp; -- start with 'now'
Query (mostly what you had already):
SELECT COUNT(DISTINCT user_id)
FROM user_is_viewing_page
WHERE page_id = 1
AND timestamp > LOCALTIMESTAMP - INTERVAL '10 sec';
CURRENT_TIMESTAMP
would work, too. But LOCALTIMESTAMP
makes more sense for your setup. Per documentation:
CURRENT_TIME
and CURRENT_TIMESTAMP
deliver values with time zone;
LOCALTIME
and LOCALTIMESTAMP
deliver values without time zone.
The look-up on the partial index itself has basically the same cost as using a full index. But since your table is supposedly big, the partial index should be much smaller than a full index, which will much rather fit and stay in RAM and be generally faster. If you have more than enough RAM, compare performance with a simple, big full index without WHERE
condition.
The advantage of the partial index obviously deteriorates over time. Create a new index with an updated timestamp in the WHERE
condition at intervals of your choosing and drop the old index after that. Queries will start the new (smaller) index immediately, so the old one can easily be dropped. Possible ways to automate are outlined in these related answers with more explanation:
You may need to add the exact WHERE
condition of the index to your queries (though seemingly redundant) to convince the query planner it's safe to use the partial index. Especially with prepared statements (including all statements in plpgsql functions) where the actual timestamp to compare to is parametrized, else Postgres cannot use the partial index for a generic query plan.
In the example above you would add the WHERE
condition to the query:
AND timestamp > '2014-12-29 23:30:00'::timestamp -- matches index condition exactly
A more general solution for this can be found in the linked answer above.
Aside: I wouldn't use "timestamp" as identifier because it's a basic type name.