I am doing a query to select records that lie in a certain geographic region and I am doing some joins and couple of filtering as well.
This is my query:
SELECT "events".* FROM "events" INNER JOIN "albums" ON "albums"."event_id" = "events"."id" INNER JOIN "photos" ON "photos"."album_id" = "albums"."id" WHERE "events"."deleted_at" IS NULL AND "albums"."deleted_at" IS NULL AND "photos"."deleted_at" IS NULL AND (events.latitude BETWEEN -44.197088742316055 AND -23.22003941183816 AND events.longitude BETWEEN 133.226480859375 AND 165.570230859375) GROUP BY events.id HAVING count(albums.id) > 0 ORDER BY start_date DESC
I have the following indexes:
Events:
"events_pkey" PRIMARY KEY, btree (id)
"index_events_on_deleted_at" btree (deleted_at)
"index_events_on_latitude_and_longitude" btree (latitude, longitude)
Albums:
"albums_pkey" PRIMARY KEY, btree (id)
"index_albums_on_deleted_at" btree (deleted_at)
"index_albums_on_event_id" btree (event_id)
Photos:
"photos_pkey" PRIMARY KEY, btree (id)
"index_photos_on_album_id" btree (album_id)
"index_photos_on_deleted_at" btree (deleted_at)
Doing an EXPLAIN ANALYZE
results in this and I dont see any usage of my indexes which. I am not sure how to force it to use the indexes. Can anyone help me optimize this?
Sort (cost=4057.46..4057.84 rows=150 width=668) (actual time=556.114..556.187 rows=76 loops=1)
Sort Key: events.start_date
Sort Method: quicksort Memory: 78kB
-> HashAggregate (cost=4050.16..4052.04 rows=150 width=668) (actual time=555.667..555.783 rows=76 loops=1)
Filter: (count(albums.id) > 0)
-> Hash Join (cost=76.14..3946.54 rows=20724 width=668) (actual time=3.675..467.578 rows=48050 loops=1)
Hash Cond: (photos.album_id = albums.id)
-> Seq Scan on photos (cost=0.00..3441.87 rows=59013 width=4) (actual time=0.008..169.206 rows=60599 loops=1)
Filter: (deleted_at IS NULL)
-> Hash (cost=74.10..74.10 rows=163 width=668) (actual time=3.633..3.633 rows=318 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 176kB
-> Hash Join (cost=49.80..74.10 rows=163 width=668) (actual time=1.195..2.519 rows=318 loops=1)
Hash Cond: (albums.event_id = events.id)
-> Seq Scan on albums (cost=0.00..21.47 rows=321 width=8) (actual time=0.011..0.458 rows=321 loops=1)
Filter: (deleted_at IS NULL)
-> Hash (cost=47.92..47.92 rows=150 width=664) (actual time=1.151..1.151 rows=195 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 126kB
-> Seq Scan on events (cost=0.00..47.92 rows=150 width=664) (actual time=0.007..0.488 rows=195 loops=1)
Filter: ((deleted_at IS NULL) AND (latitude >= (-44.1970887423161)::double precision) AND (latitude <= (-23.2200394118382)::double precision) AND (longitude >= 133.226480859375::double precision) AND (longitude <= 165.570230859375::double precision))
Total runtime: 556.459 ms
Thanks!!
EDIT: Thanks for the links. I have tried disabling seqscan
. Now my plan is:
Sort (cost=5565.73..5566.10 rows=150 width=46) (actual time=451.208..451.290 rows=76 loops=1)
Sort Key: (date(events.start_date))
Sort Method: quicksort Memory: 31kB
-> GroupAggregate (cost=0.00..5560.31 rows=150 width=46) (actual time=2.990..450.850 rows=76 loops=1)
Filter: (count(albums.id) > 0)
-> Nested Loop (cost=0.00..5454.44 rows=20724 width=46) (actual time=0.077..278.319 rows=48050 loops=1)
-> Merge Join (cost=0.00..205.35 rows=163 width=46) (actual time=0.051..2.856 rows=318 loops=1)
Merge Cond: (events.id = albums.event_id)
-> Index Scan using events_pkey on events (cost=0.00..118.72 rows=150 width=42) (actual time=0.024..0.792 rows=195 loops=1)
Filter: ((deleted_at IS NULL) AND (latitude >= (-44.1970887423161)::double precision) AND (latitude <= (-23.2200394118382)::double precision)
AND (longitude >= 133.226480859375::double precision) AND (longitude <= 165.570230859375::double precision))
-> Index Scan using index_albums_on_event_id on albums (cost=0.00..83.83 rows=321 width=8) (actual time=0.017..0.832 rows=321 loops=1)
Filter: (deleted_at IS NULL)
-> Index Scan using index_photos_on_album_id on photos (cost=0.00..30.27 rows=155 width=4) (actual time=0.010..0.409 rows=151 loops=318)
Index Cond: (album_id = albums.id)
Filter: (deleted_at IS NULL)
Total runtime: 451.562 ms
Still Indexes are not being fully used especially for the latitude and long conditions. Do I have my indexes setup correctly?
EDIT: After looking at answers at http://stackoverflow.com/questions/8228326/how-can-i-avoid-postgresql-sometimes-choosing-a-bad-query-plan-for-one-of-two-ne
, I assumed that its behaving like this because my query was returning all records, then I updated the conditions, and the new query plan is:
Sort (cost=786.18..786.22 rows=19 width=668) (actual time=3.754..3.755 rows=2 loops=1)
Sort Key: events.start_date
Sort Method: quicksort Memory: 25kB
-> HashAggregate (cost=785.54..785.77 rows=19 width=668) (actual time=3.700..3.703 rows=2 loops=1)
Filter: (count(albums.id) > 0)
-> Nested Loop (cost=48.39..765.51 rows=2670 width=668) (actual time=1.116..2.968 rows=543 loops=1)
-> Hash Join (cost=48.39..89.25 rows=21 width=668) (actual time=1.093..1.128 rows=3 loops=1)
Hash Cond: (events.id = albums.event_id)
-> Bitmap Heap Scan on events (cost=9.42..49.44 rows=19 width=664) (actual time=0.061..0.080 rows=9 loops=1)
Recheck Cond: ((latitude >= (-33.7474111086624)::double precision) AND (latitude <= (-33.581678187556)::double precision) AND (longitude >= 151.193933862305::double precision) AND (longitude <= 151.44661940918::double precision))
Filter: (deleted_at IS NULL)
-> Bitmap Index Scan on index_events_on_latitude_and_longitude (cost=0.00..9.42 rows=28 width=0) (actual time=0.050..0.050 rows=9 loops=1)
Index Cond: ((latitude >= (-33.7474111086624)::double precision) AND (latitude <= (-33.581678187556)::double precision) AND (longitude >= 151.193933862305::double precision) AND (longitude <= 151.44661940918::double precision))
-> Hash (cost=34.95..34.95 rows=321 width=8) (actual time=0.992..0.992 rows=321 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 13kB
-> Bitmap Heap Scan on albums (cost=14.74..34.95 rows=321 width=8) (actual time=0.069..0.570 rows=321 loops=1)
Recheck Cond: (deleted_at IS NULL)
-> Bitmap Index Scan on index_albums_on_deleted_at (cost=0.00..14.66 rows=321 width=0) (actual time=0.056..0.056 rows=321 loops=1)
Index Cond: (deleted_at IS NULL)
-> Index Scan using index_photos_on_album_id on photos (cost=0.00..30.27 rows=155 width=4) (actual time=0.014..0.273 rows=181 loops=3)
Index Cond: (album_id = albums.id)
Filter: (deleted_at IS NULL)
Total runtime: 3.958 ms
And the time is verrry less!! Any suggestions?