I'm trying to determine if there a "low cost" optimization for the following query. We've implemented a system whereby 'tickets' earn 'points' and thus can be ranked. In order to support analytical type of queries, we store the rank of every ticket (tickets can be tied) with the ticket.
I've found that, at scale, updating this rank is very slow. I'm attempting to run the scenario below on a set of "tickets" that is about 20k tickets big.
I'm hoping that someone can help identify why and offer some help.
We're on postgres 9.3.6
Here's a simplified ticket table schema:
ogs_1=> \d api_ticket
Table "public.api_ticket"
Column | Type | Modifiers
------------------------------+--------------------------+---------------------------------------------------------
id | integer | not null default nextval('api_ticket_id_seq'::regclass)
status | character varying(3) | not null
points_earned | integer | not null
rank | integer | not null
event_id | integer | not null
user_id | integer | not null
Indexes:
"api_ticket_pkey" PRIMARY KEY, btree (id)
"api_ticket_4437cfac" btree (event_id)
"api_ticket_e8701ad4" btree (user_id)
"api_ticket_points_earned_idx" btree (points_earned)
"api_ticket_rank_idx" btree ("rank")
Foreign-key constraints:
"api_ticket_event_id_598c97289edc0e3e_fk_api_event_id" FOREIGN KEY (event_id) REFERENCES api_event(id) DEFERRABLE INITIALLY DEFERRED
(user_id) REFERENCES auth_user(id) DEFERRABLE INITIALLY DEFERRED
Here's the query that I'm executing:
UPDATE api_ticket t SET rank = (
SELECT rank
FROM (SELECT Rank() over (
Partition BY event_id ORDER BY points_earned DESC
) as rank, id
FROM api_ticket tt
WHERE event_id = t.event_id
AND tt.status != 'x'
) as r
WHERE r.id = t.id
)
WHERE event_id = <EVENT_ID> AND t.status != 'x';
Here's the explain on a set of about 10k rows:
Update on api_ticket t (cost=0.00..1852176.70 rows=9646 width=88) (actual time=1254035.623..1254035.623 rows=0 loops=1)
-> Seq Scan on api_ticket t (cost=0.00..1852176.70 rows=9646 width=88) (actual time=121.611..1253148.416 rows=9748 loops=1)
Filter: (((status)::text <> 'x'::text) AND (event_id = 207))
Rows Removed by Filter: 10
SubPlan 1
-> Subquery Scan on r (cost=159.78..191.97 rows=1 width=8) (actual time=87.466..128.537 rows=1 loops=9748)
Filter: (r.id = t.id)
Rows Removed by Filter: 9747
-> WindowAgg (cost=159.78..178.55 rows=1073 width=12) (actual time=46.389..108.954 rows=9748 loops=9748)
-> Sort (cost=159.78..162.46 rows=1073 width=12) (actual time=46.370..66.163 rows=9748 loops=9748)
Sort Key: tt.points_earned
Sort Method: quicksort Memory: 799kB
-> Index Scan using api_ticket_4437cfac on api_ticket tt (cost=0.29..105.77 rows=1073 width=12) (actual time=2.698..26.448 rows=9748 loops=9748)
Index Cond: (event_id = t.event_id)
Filter: ((status)::text <> 'x'::text)
Total runtime: 1254036.583 ms