
I am running a query to select records that lie within a certain geographic region, with a couple of joins and some filtering as well.

This is my query:

SELECT "events".* FROM "events" INNER JOIN "albums" ON "albums"."event_id" = "events"."id" INNER JOIN "photos" ON "photos"."album_id" = "albums"."id" WHERE "events"."deleted_at" IS NULL AND "albums"."deleted_at" IS NULL AND "photos"."deleted_at" IS NULL AND (events.latitude BETWEEN -44.197088742316055 AND -23.22003941183816 AND events.longitude BETWEEN 133.226480859375 AND 165.570230859375) GROUP BY events.id HAVING count(albums.id) > 0  ORDER BY start_date DESC

I have the following indexes:

Events:

"events_pkey" PRIMARY KEY, btree (id)
"index_events_on_deleted_at" btree (deleted_at)
"index_events_on_latitude_and_longitude" btree (latitude, longitude)

Albums:

"albums_pkey" PRIMARY KEY, btree (id)
"index_albums_on_deleted_at" btree (deleted_at)
"index_albums_on_event_id" btree (event_id)

Photos:

"photos_pkey" PRIMARY KEY, btree (id)
"index_photos_on_album_id" btree (album_id)
"index_photos_on_deleted_at" btree (deleted_at)

Running EXPLAIN ANALYZE gives the plan below, and I don't see any usage of my indexes. I am not sure how to force it to use them. Can anyone help me optimize this?

 Sort  (cost=4057.46..4057.84 rows=150 width=668) (actual time=556.114..556.187 rows=76 loops=1)
   Sort Key: events.start_date
   Sort Method: quicksort  Memory: 78kB
   ->  HashAggregate  (cost=4050.16..4052.04 rows=150 width=668) (actual time=555.667..555.783 rows=76 loops=1)
         Filter: (count(albums.id) > 0)
         ->  Hash Join  (cost=76.14..3946.54 rows=20724 width=668) (actual time=3.675..467.578 rows=48050 loops=1)
               Hash Cond: (photos.album_id = albums.id)
               ->  Seq Scan on photos  (cost=0.00..3441.87 rows=59013 width=4) (actual time=0.008..169.206 rows=60599 loops=1)
                     Filter: (deleted_at IS NULL)
               ->  Hash  (cost=74.10..74.10 rows=163 width=668) (actual time=3.633..3.633 rows=318 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 176kB
                     ->  Hash Join  (cost=49.80..74.10 rows=163 width=668) (actual time=1.195..2.519 rows=318 loops=1)
                           Hash Cond: (albums.event_id = events.id)
                           ->  Seq Scan on albums  (cost=0.00..21.47 rows=321 width=8) (actual time=0.011..0.458 rows=321 loops=1)
                                 Filter: (deleted_at IS NULL)
                           ->  Hash  (cost=47.92..47.92 rows=150 width=664) (actual time=1.151..1.151 rows=195 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 126kB
                                 ->  Seq Scan on events  (cost=0.00..47.92 rows=150 width=664) (actual time=0.007..0.488 rows=195 loops=1)
                                       Filter: ((deleted_at IS NULL) AND (latitude >= (-44.1970887423161)::double precision) AND (latitude <= (-23.2200394118382)::double precision) AND (longitude >= 133.226480859375::double precision) AND (longitude <= 165.570230859375::double precision))
 Total runtime: 556.459 ms

Thanks!!

EDIT: Thanks for the links. I have tried disabling seqscan, as shown in the snippet below.
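
For reference, I disabled sequential scans only for the current session, roughly like this (a diagnostic to compare plans, not a fix):

SET enable_seqscan = off;
-- re-run EXPLAIN ANALYZE on the query above
RESET enable_seqscan;

With seqscans disabled, my plan is now: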

 Sort  (cost=5565.73..5566.10 rows=150 width=46) (actual time=451.208..451.290 rows=76 loops=1)
   Sort Key: (date(events.start_date))
   Sort Method: quicksort  Memory: 31kB
   ->  GroupAggregate  (cost=0.00..5560.31 rows=150 width=46) (actual time=2.990..450.850 rows=76 loops=1)
         Filter: (count(albums.id) > 0)
         ->  Nested Loop  (cost=0.00..5454.44 rows=20724 width=46) (actual time=0.077..278.319 rows=48050 loops=1)
               ->  Merge Join  (cost=0.00..205.35 rows=163 width=46) (actual time=0.051..2.856 rows=318 loops=1)
                     Merge Cond: (events.id = albums.event_id)
                     ->  Index Scan using events_pkey on events  (cost=0.00..118.72 rows=150 width=42) (actual time=0.024..0.792 rows=195 loops=1)
                            Filter: ((deleted_at IS NULL) AND (latitude >= (-44.1970887423161)::double precision) AND (latitude <= (-23.2200394118382)::double precision) AND (longitude >= 133.226480859375::double precision) AND (longitude <= 165.570230859375::double precision))
                     ->  Index Scan using index_albums_on_event_id on albums  (cost=0.00..83.83 rows=321 width=8) (actual time=0.017..0.832 rows=321 loops=1)
                           Filter: (deleted_at IS NULL)
               ->  Index Scan using index_photos_on_album_id on photos  (cost=0.00..30.27 rows=155 width=4) (actual time=0.010..0.409 rows=151 loops=318)
                     Index Cond: (album_id = albums.id)
                     Filter: (deleted_at IS NULL)
 Total runtime: 451.562 ms

Still, the indexes are not being fully used, especially for the latitude and longitude conditions. Do I have my indexes set up correctly?
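
As an aside, just a sketch of an alternative I have seen suggested for bounding-box searches (the index name and the rewritten predicate here are purely illustrative): a GiST expression index on a point can cover both the latitude and longitude ranges at once.

CREATE INDEX index_events_on_lonlat_gist
    ON events USING gist (point(longitude, latitude));

-- the WHERE clause would then use a box-containment test instead of BETWEEN:
--   point(events.longitude, events.latitude)
--     <@ box(point(133.226480859375, -44.197088742316055),
--            point(165.570230859375, -23.22003941183816))

Whether the planner actually picks any index still depends on how selective the bounding box is.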

EDIT: After looking at the answers at http://stackoverflow.com/questions/8228326/how-can-i-avoid-postgresql-sometimes-choosing-a-bad-query-plan-for-one-of-two-ne, I assumed it was behaving like this because my query was returning all the records, so I updated the conditions, and the new query plan is:

 Sort  (cost=786.18..786.22 rows=19 width=668) (actual time=3.754..3.755 rows=2 loops=1)
   Sort Key: events.start_date
   Sort Method: quicksort  Memory: 25kB
   ->  HashAggregate  (cost=785.54..785.77 rows=19 width=668) (actual time=3.700..3.703 rows=2 loops=1)
         Filter: (count(albums.id) > 0)
         ->  Nested Loop  (cost=48.39..765.51 rows=2670 width=668) (actual time=1.116..2.968 rows=543 loops=1)
               ->  Hash Join  (cost=48.39..89.25 rows=21 width=668) (actual time=1.093..1.128 rows=3 loops=1)
                     Hash Cond: (events.id = albums.event_id)
                     ->  Bitmap Heap Scan on events  (cost=9.42..49.44 rows=19 width=664) (actual time=0.061..0.080 rows=9 loops=1)
                           Recheck Cond: ((latitude >= (-33.7474111086624)::double precision) AND (latitude <= (-33.581678187556)::double precision) AND (longitude >= 151.193933862305::double precision) AND (longitude <= 151.44661940918::double precision))
                           Filter: (deleted_at IS NULL)
                           ->  Bitmap Index Scan on index_events_on_latitude_and_longitude  (cost=0.00..9.42 rows=28 width=0) (actual time=0.050..0.050 rows=9 loops=1)
                                 Index Cond: ((latitude >= (-33.7474111086624)::double precision) AND (latitude <= (-33.581678187556)::double precision) AND (longitude >= 151.193933862305::double precision) AND (longitude <= 151.44661940918::double precision))
                     ->  Hash  (cost=34.95..34.95 rows=321 width=8) (actual time=0.992..0.992 rows=321 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 13kB
                           ->  Bitmap Heap Scan on albums  (cost=14.74..34.95 rows=321 width=8) (actual time=0.069..0.570 rows=321 loops=1)
                                 Recheck Cond: (deleted_at IS NULL)
                                 ->  Bitmap Index Scan on index_albums_on_deleted_at  (cost=0.00..14.66 rows=321 width=0) (actual time=0.056..0.056 rows=321 loops=1)
                                       Index Cond: (deleted_at IS NULL)
               ->  Index Scan using index_photos_on_album_id on photos  (cost=0.00..30.27 rows=155 width=4) (actual time=0.014..0.273 rows=181 loops=3)
                     Index Cond: (album_id = albums.id)
                     Filter: (deleted_at IS NULL)
 Total runtime: 3.958 ms

And the time is much lower now!! Any suggestions?

Steve Robinson

1 Answer


This is mostly because you did not set a LIMIT clause on your query. Without LIMIT you always request all of the matching data from your tables, so going through their indexes as well would not pay off.
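
For illustration, capping the original query would just mean appending a limit (the value 20 here is arbitrary):

SELECT "events".* FROM "events"
INNER JOIN "albums" ON "albums"."event_id" = "events"."id"
INNER JOIN "photos" ON "photos"."album_id" = "albums"."id"
WHERE ...           -- same filters as in the question
GROUP BY events.id
HAVING count(albums.id) > 0
ORDER BY start_date DESC
LIMIT 20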

SQLFiddle 1, 2 vs. 3, 4

Also note that FOREIGN KEY constraints do not add indexes (but UNIQUE & PRIMARY KEY constraints do). So you might want to add indexes for albums.event_id & photos.album_id.
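
For completeness, such indexes are created with plain CREATE INDEX statements, e.g. (the names are only illustrative; the question already lists equivalent indexes):

CREATE INDEX index_albums_on_event_id ON albums (event_id);
CREATE INDEX index_photos_on_album_id ON photos (album_id);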

SQLFiddle 3 vs. 4

Sorting can use an index, if one is present. For your query this means an index on events.start_date.
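
A minimal sketch (the index name is illustrative; a btree can be scanned in either direction, so it also serves ORDER BY start_date DESC):

CREATE INDEX index_events_on_start_date ON events (start_date);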

pozs
  • Thanks a lot @pozs. Unfortunately I cannot add `LIMIT` at the moment in my app, as it would need a lot of front-end changes, but it did help!! :) – Steve Robinson Jul 07 '14 at 12:19
  • And also, I already have indexes on those two `FK`s. Btw, let me add an index for start_date too! Helped me a lot. Thanks! – Steve Robinson Jul 07 '14 at 12:20
  • Adding `LIMIT` is not helping. I added the index for sorting too, and I even removed the sorting, yet the time stays at 450-500ms. – Steve Robinson Jul 07 '14 at 13:22
  • Hi @pozs, I had been doing a needless `JOIN` with the `photos` table!! :) I realized that I don't need it. Removing it drastically reduced the time to around 4ms!! – Steve Robinson Jul 07 '14 at 14:12
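
A sketch of the simplified query without the photos join, reconstructed from the original query in the question (the photos.deleted_at filter drops out with it, so events whose albums have only soft-deleted photos would now be included):

SELECT "events".* FROM "events"
INNER JOIN "albums" ON "albums"."event_id" = "events"."id"
WHERE "events"."deleted_at" IS NULL
  AND "albums"."deleted_at" IS NULL
  AND (events.latitude BETWEEN -44.197088742316055 AND -23.22003941183816
   AND events.longitude BETWEEN 133.226480859375 AND 165.570230859375)
GROUP BY events.id
HAVING count(albums.id) > 0
ORDER BY start_date DESC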