I've got a query with an ORDER and a LIMIT to support a paginated interface:
SELECT segment_members.id AS t0_r0,
segment_members.segment_id AS t0_r1,
segment_members.account_id AS t0_r2,
segment_members.score AS t0_r3,
segment_members.created_at AS t0_r4,
segment_members.updated_at AS t0_r5,
segment_members.posts_count AS t0_r6,
accounts.id AS t1_r0,
accounts.platform AS t1_r1,
accounts.username AS t1_r2,
accounts.created_at AS t1_r3,
accounts.updated_at AS t1_r4,
accounts.remote_id AS t1_r5,
accounts.name AS t1_r6,
accounts.language AS t1_r7,
accounts.description AS t1_r8,
accounts.timezone AS t1_r9,
accounts.profile_image_url AS t1_r10,
accounts.post_count AS t1_r11,
accounts.follower_count AS t1_r12,
accounts.following_count AS t1_r13,
accounts.uri AS t1_r14,
accounts.location AS t1_r15,
accounts.favorite_count AS t1_r16,
accounts.raw AS t1_r17,
accounts.followers_completed_at AS t1_r18,
accounts.followings_completed_at AS t1_r19,
accounts.followers_started_at AS t1_r20,
accounts.followings_started_at AS t1_r21,
accounts.profile_fetched_at AS t1_r22,
accounts.managed_source_id AS t1_r23
FROM segment_members
INNER JOIN accounts ON accounts.id = segment_members.account_id
WHERE segment_members.segment_id = 1
ORDER BY accounts.follower_count ASC LIMIT 20
OFFSET 0;
Here are the indexes on the tables:
accounts
"accounts_pkey" PRIMARY KEY, btree (id)
"index_accounts_on_remote_id_and_platform" UNIQUE, btree (remote_id, platform)
"index_accounts_on_description" btree (description)
"index_accounts_on_favorite_count" btree (favorite_count)
"index_accounts_on_follower_count" btree (follower_count)
"index_accounts_on_following_count" btree (following_count)
"index_accounts_on_lower_username_and_platform" btree (lower(username::text), platform)
"index_accounts_on_post_count" btree (post_count)
"index_accounts_on_profile_fetched_at_and_platform" btree (profile_fetched_at, platform)
"index_accounts_on_username" btree (username)
segment_members
"segment_members_pkey" PRIMARY KEY, btree (id)
"index_segment_members_on_segment_id_and_account_id" UNIQUE, btree (segment_id, account_id)
"index_segment_members_on_account_id" btree (account_id)
"index_segment_members_on_segment_id" btree (segment_id)
In my development and staging databases, the query plan looks like the following, and the query executes very quickly.
Limit (cost=4802.15..4802.20 rows=20 width=2086)
-> Sort (cost=4802.15..4803.20 rows=421 width=2086)
Sort Key: accounts.follower_count
-> Nested Loop (cost=20.12..4790.95 rows=421 width=2086)
-> Bitmap Heap Scan on segment_members (cost=19.69..1244.24 rows=421 width=38)
Recheck Cond: (segment_id = 1)
-> Bitmap Index Scan on index_segment_members_on_segment_id_and_account_id (cost=0.00..19.58 rows=
421 width=0)
Index Cond: (segment_id = 1)
-> Index Scan using accounts_pkey on accounts (cost=0.43..8.41 rows=1 width=2048)
Index Cond: (id = segment_members.account_id)
In production, however, the query plan is the following, and the query takes forever (several minutes until it hits the statement timeout).
Limit (cost=0.86..25120.72 rows=20 width=2130)
-> Nested Loop (cost=0.86..4614518.64 rows=3674 width=2130)
-> Index Scan using index_accounts_on_follower_count on accounts (cost=0.43..2779897.53 rows=3434917 width=209
2)
-> Index Scan using index_segment_members_on_segment_id_and_account_id on segment_members (cost=0.43..0.52 row
s=1 width=38)
Index Cond: ((segment_id = 1) AND (account_id = accounts.id))
accounts
has about 6m rows in staging and 3m in production. segment_members
has about 300k rows in staging and 4m in production. Is it the differences in table sizes that is causing the differences in the query plan selection? Is there any way I can get Postgres to use the faster query plan in production?
Update: Here's the EXPLAIN ANALYZE from the slow production server:
Limit (cost=0.86..22525.66 rows=20 width=2127) (actual time=173.148..187568.247 rows=20 loops=1)
-> Nested Loop (cost=0.86..4654749.92 rows=4133 width=2127) (actual time=173.141..187568.193 rows=20 loops=1)
-> Index Scan using index_accounts_on_follower_count on accounts (cost=0.43..2839731.81 rows=3390197 width=2089) (actual time=0.110..180374.279 rows=1401278 loops=1)
-> Index Scan using index_segment_members_on_segment_id_and_account_id on segment_members (cost=0.43..0.53 rows=1 width=38) (actual time=0.003..0.003 rows=0 loops=1401278)
Index Cond: ((segment_id = 1) AND (account_id = accounts.id))
Total runtime: 187568.318 ms
(6 rows)