
I'm working on a new feature, and in the DEV and QA environments the performance is consistent: the execution time of the modified query is the same as (or better than) that of the original query.

When I moved the code to production, suddenly the modified query took a serious performance hit.

I ran EXPLAIN on both the original and the modified query, and the difference appears to be in how the conditions for one scan are evaluated:

In my Dev, QA, and unmodified production, I get:

Index Scan using taggings_taggable_context_idx on public.taggings as pltags (cost=0.08..3.47 rows=1 width=8) (actual=0.005..0.006 rows=2 loops=6959)
Index Cond: ((pltags.taggable_id = pt.id) AND ((pltags.taggable_type)::text = 'PulseTemplate'::text) AND ((pltags.context)::text = 'product_lines'::text))

But from the modified production EXPLAIN, I get:

Index Scan using type_context_id on public.taggings as pltags (cost=0.08..4.09 rows=1 width=8) (actual=0.011..0.17 rows=468 loops=5172)
Index Cond: (((pltags.taggable_type)::text = 'PulseTemplate'::text) AND ((pltags.context)::text = 'product_lines'::text))

Suddenly it has dropped the (pltags.taggable_id = pt.id) part of the Index Cond, so the scan now returns 468 rows per loop instead of 2.

The join I'm using was not touched by the modification:

LEFT JOIN taggings pltags on pltags.taggable_type = 'PulseTemplate' AND pltags.context = 'product_lines' AND pltags.taggable_id = pt.id

The change is in the AND condition here:

            INNER JOIN pulse_group_products pp on pp.id = i.front_product_id 
                AND (
                    pp.primary_target = CASE WHEN (pt.target_override <> '') IS NOT TRUE then pp.primary_target ELSE pt.target_override end
                    OR pp.secondary_target = CASE WHEN (pt.target_override <> '') IS NOT TRUE then pp.primary_target ELSE pt.target_override end
                )
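
For context, the double negative in the CASE is deliberate: (pt.target_override <> '') IS NOT TRUE is true when the override is NULL or the empty string. A standalone illustration with made-up values:

    -- Made-up values: show how the condition treats NULL, '', and a real override.
    SELECT t.target_override,
           (t.target_override <> '') IS NOT TRUE AS treated_as_missing,
           CASE WHEN (t.target_override <> '') IS NOT TRUE
                THEN 'fall back to pp.primary_target'
                ELSE t.target_override END AS effective_target
    FROM (VALUES (NULL), (''), ('EMEA')) AS t(target_override);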

The database backs a Rails application, so the schema has been managed only by the same Rails migration files in every environment; the schemas are as identical as I can make them.

The index that is being used is defined identically in all environments (a catalog query to verify that follows the definition):

CREATE INDEX type_context_id
ON public.taggings USING btree
(taggable_type COLLATE pg_catalog."default" ASC NULLS LAST, context COLLATE pg_catalog."default" ASC NULLS LAST, taggable_id ASC NULLS LAST)
TABLESPACE pg_default;
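
To verify, in each environment (pg_indexes is the standard catalog view; the filter values come from the question):

    -- List every index on public.taggings with its full definition;
    -- run this in each environment and compare the output.
    SELECT indexname, indexdef
    FROM pg_indexes
    WHERE schemaname = 'public'
      AND tablename = 'taggings';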

Why would the production database suddenly change the conditions for this join, and how can I force it to include the conditions I specified?

asked by Tezyn
  • It's possible its statistics are out of date. [Try analyzing the table](https://confluence.atlassian.com/kb/optimize-and-improve-postgresql-performance-with-vacuum-analyze-and-reindex-885239781.html). Or production has more data than Dev or QA, so it has chosen a different plan. Note that there is no change to the query itself, just the query plan; the results will remain the same. – Schwern Jan 25 '21 at 20:48
  • Also, double negatives are very hard to read. `CASE WHEN (pt.target_override <> '') IS NOT TRUE then pp.primary_target ELSE pt.target_override end` is better as `CASE WHEN pt.target_override <> '' then pt.target_override ELSE pp.primary_target end`. However, consider setting it to null instead of empty string: `coalesce(pt.target_override, pp.primary_target)`. – Schwern Jan 25 '21 at 20:53
  • @Schwern thanks, analyzing the table restored the condition and brought the performance to where I expected it to be. – Tezyn Jan 25 '21 at 20:59
  • Could you show the entire query and the entire result from EXPLAIN ANALYZE? – Frank Heikens Jan 25 '21 at 20:59
  • @Schwern the `CASE WHEN (pt.target_override <> '') IS NOT TRUE` is how I was trying to handle both NULLs and empty strings: https://stackoverflow.com/a/23767625/901729 I'm still looking for a more elegant way. – Tezyn Jan 25 '21 at 21:33
  • @Tezyn Bad data becomes everyone's problem and complicates queries. Fix it at the source, normalize your data. Update your data to convert empty strings to nulls. Add a CHECK constraint to disallow empty strings. – Schwern Jan 25 '21 at 22:42
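
For future readers: per the comments above, the fix that worked was refreshing the planner statistics. A minimal sketch (the table name comes from the question; the pg_stat_user_tables query is just one way to see when statistics were last gathered):

    -- Refresh the row estimates the planner relies on.
    ANALYZE public.taggings;

    -- Optional: when were statistics last gathered (manually or by autovacuum),
    -- and how many rows have changed since?
    SELECT relname, last_analyze, last_autoanalyze, n_mod_since_analyze
    FROM pg_stat_user_tables
    WHERE relname = 'taggings';

As for the CASE, the comments converge on normalizing empty strings to NULL so the expression collapses to coalesce(pt.target_override, pp.primary_target); until the data is cleaned up, coalesce(nullif(pt.target_override, ''), pp.primary_target) should handle both NULL and '' in one step.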

0 Answers