
I have the following query in PostgreSQL (9.5.1):

select e.id, (select count(id) from imgitem ii where ii.tabid = e.id and ii.tab = 'esp') as imgs,
 e.ano, e.mes, e.dia, cast(cast(e.ano as varchar(4))||'-'||right('0'||cast(e.mes as varchar(2)),2)||'-'|| right('0'||cast(e.dia as varchar(2)),2) as varchar(10)) as data,
 pl.pltag, e.inpa, e.det, d.ano anodet, coalesce(p.abrev,'')||' ('||coalesce(p.prenome,'')||')' determinador, d.tax, coalesce(v.val,v.valf)||' '||vu.unit as altura,
 coalesce(v1.val,v1.valf)||' '||vu1.unit as DAP, d.fam, tf.nome família, d.gen, tg.nome gênero, d.sp, ts.nome espécie, d.inf, e.loc, l.nome localidade, e.lat, e.lon
from esp e
left join det d on e.det = d.id
left join tax tf on d.fam = tf.oldfam
left join tax tg on d.gen = tg.oldgen
left join tax ts on d.sp = ts.oldsp
left join tax ti on d.inf = ti.oldinf
left join loc l on e.loc = l.id
left join pess p on p.id = d.detby
left join var v on v.esp = e.id and v.key = 265
left join varunit vu on vu.id = v.unit
left join var v1 on v1.esp = e.id and v1.key = 264
left join varunit vu1 on vu1.id = v1.unit
left join pl on pl.id = e.pl
WHERE unaccent(TEXT(coalesce(p.abrev,'')||' ('||coalesce(p.prenome,'')||')')) ilike unaccent('%vicen%')

It takes 430ms to retrieve 1129 rows from a total of 9250 in esp table.

If I change the search term from %vicen% to %vicent% (adding a 't'), it takes 431ms to retrieve the same 1129 rows.

Ordering by the search column, ascending and descending, I see that all 1129 rows have exactly the same name in both cases.

Now the strange part: if I change the search term from %vicent% to %vicenti% (adding an 'i'), it takes an unbelievable 24.4 seconds to retrieve the same 1129 rows!

The searched term is always in the first coalesce, i.e. coalesce(p.abrev,''). I expect the query to run slower or faster depending on the size of the search string, but not by that much! Does anyone have any idea what's going on?

Results of EXPLAIN ANALYZE (would exceed the 30k character limit here):

For %vicen%: http://explain.depesz.com/s/2XF

For %vicenti%: http://explain.depesz.com/s/dEc6

Erwin Brandstetter
Rodrigo
  • It must be trying to use some other index at that point. Use EXPLAIN to show the execution plan for each. – user1751825 Mar 31 '16 at 21:18
  • Does this reproduce more than once? – Tamas Ionut Mar 31 '16 at 21:25
  • Please [edit] your question and add the execution plans for both queries using `explain (analyze, verbose)` –  Mar 31 '16 at 21:30
  • @user1751825 When using EXPLAIN they all go very fast, but I'll try to understand the execution plan later. Thank you. – Rodrigo Mar 31 '16 at 21:38
  • @TamasIonut Yes, many times. Really weird. – Rodrigo Mar 31 '16 at 21:38
  • @a_horse_with_no_name Unfortunately, there's no room for both execution plans. I removed the beginning of the first one. Thank you! – Rodrigo Mar 31 '16 at 21:47
  • You can upload them to http://explain.depesz.com –  Mar 31 '16 at 21:49
  • You *must* provide your version of Postgres for such a question. There have been major improvements to GIN indexes in Postgres 9.4 and 9.5. Better move your huge `EXPLAIN` output to http://explain.depesz.com/ and post a link. Why the odd predicate `WHERE unaccent(TEXT(coalesce(p.abrev,'')||' ('||coalesce(p.prenome,'')||')')) ilike unaccent('%vicen%')`? Do you have a matching index? Can you remove some of the (unrelated) `LEFT JOIN`s and still reproduce the effect? If so, please trim the clutter and provide an [SSCE](http://sscce.org/). – Erwin Brandstetter Apr 01 '16 at 01:20
  • Also relevant: What do you get for `SHOW join_collapse_limit;`? – Erwin Brandstetter Apr 01 '16 at 02:40
  • @ErwinBrandstetter. No, I hadn't used any indexes yet. The odd predicate was just a repeat of the select field in the where clause, since we can't use the alias. I'll post the new version here, after I grasp all your suggested improvements. join_collapse_limit is 8. – Rodrigo Apr 01 '16 at 14:45
  • What @ErwinBrandstetter says: probably the range table is beyond the collapse limit (12 or 14) and geqo kicks in. (Workaround: use a CTE to squeeze out the trivial subquery, such as a lookup. Or: increase the limit.) – joop Apr 01 '16 at 15:10
  • BTW: you are not using any of the fields from **ti** `left join tax ti on d.inf = ti.oldinf` – joop Apr 01 '16 at 15:39

2 Answers


Why?

Fast query:

->  Hash Left Join  (cost=1378.60..2467.48 rows=15 width=79) (actual time=41.759..85.037 rows=1129 loops=1)
      ...
      Filter: (unaccent(((((COALESCE(p.abrev, ''::character varying))::text || ' ('::text) || (COALESCE(p.prenome, ''::character varying))::text) || ')'::text)) ~~* (...)

Slow query:

->  Hash Left Join  (cost=1378.60..2467.48 rows=1 width=79) (actual time=35.084..80.209 rows=1129 loops=1)
      ...
      Filter: (unaccent(((((COALESCE(p.abrev, ''::character varying))::text || ' ('::text) || (COALESCE(p.prenome, ''::character varying))::text) || ')'::text)) ~~* unacc (...)

Extending the search pattern by another character causes Postgres to assume yet fewer hits. (Typically, this is a reasonable estimate.) Postgres obviously does not have precise enough statistics (none, actually, see below) to expect the same number of hits that you really get.

This causes a switch to a different query plan, which is even less optimal for the actual number of hits rows=1129.
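The mismatch between estimate and reality can be inspected directly with `EXPLAIN (ANALYZE, VERBOSE)` on a stripped-down version of the predicate. A sketch, with table and column names taken from the query above; compare the planner's `rows=` estimate to the `actual ... rows=` count for each pattern:

```sql
-- Watch how the row estimate shrinks as the pattern grows,
-- while the actual row count stays at 1129.
EXPLAIN (ANALYZE, VERBOSE)
SELECT p.id
FROM   pess p
WHERE  unaccent(coalesce(p.abrev,'') || ' (' || coalesce(p.prenome,'') || ')')
       ILIKE unaccent('%vicenti%');
```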

Solution

One way to improve the situation is to create an expression index on the expression in the predicate. This makes Postgres gather statistics for the actual expression, which can help the query even if the index itself is not used for the query. Without the index, there are no statistics for the expression at all. And if done right, the index can be used for the query itself, which is much better still. But there are multiple problems with your current expression:

unaccent(TEXT(coalesce(p.abrev,'')||' ('||coalesce(p.prenome,'')||')')) ilike unaccent('%vicen%')

Consider this updated query, based on some assumptions about your undisclosed table definitions:

SELECT e.id
     , (SELECT count(*) FROM imgitem
        WHERE tabid = e.id AND tab = 'esp') AS imgs -- count(*) is faster
     , e.ano, e.mes, e.dia
     , e.ano::text || to_char(e.mes, 'FM"-"00')
                   || to_char(e.dia, 'FM"-"00') AS data
     , pl.pltag, e.inpa, e.det, d.ano anodet
     , format('%s (%s)', p.abrev, p.prenome) AS determinador
     , d.tax
     , coalesce(v.val,v.valf)   || ' ' || vu.unit  AS altura
     , coalesce(v1.val,v1.valf) || ' ' || vu1.unit AS dap
     , d.fam, tf.nome família, d.gen, tg.nome AS gênero, d.sp
     , ts.nome AS espécie, d.inf, e.loc, l.nome localidade, e.lat, e.lon
FROM      pess    p                        -- reorder!
JOIN      det     d   ON d.detby   = p.id  -- INNER JOIN !
LEFT JOIN tax     tf  ON tf.oldfam = d.fam
LEFT JOIN tax     tg  ON tg.oldgen = d.gen
LEFT JOIN tax     ts  ON ts.oldsp  = d.sp
LEFT JOIN tax     ti  ON ti.oldinf = d.inf  -- unused, see @joop's comment
LEFT JOIN esp     e   ON e.det     = d.id
LEFT JOIN loc     l   ON l.id      = e.loc
LEFT JOIN var     v   ON v.esp     = e.id AND v.key  = 265
LEFT JOIN varunit vu  ON vu.id     = v.unit
LEFT JOIN var     v1  ON v1.esp    = e.id AND v1.key = 264
LEFT JOIN varunit vu1 ON vu1.id    = v1.unit
LEFT JOIN pl          ON pl.id     = e.pl
WHERE f_unaccent(p.abrev)   ILIKE f_unaccent('%' || 'vicenti' || '%') OR
      f_unaccent(p.prenome) ILIKE f_unaccent('%' || 'vicenti' || '%');

Major points

Why f_unaccent()? Because unaccent() can't be indexed. Read here:
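The wrapper outlined in the linked answer is roughly as follows (a sketch: the two-argument form of unaccent() with a schema-qualified dictionary makes the result stable, so the function can be declared IMMUTABLE and used in an index):

```sql
CREATE OR REPLACE FUNCTION f_unaccent(text)
  RETURNS text AS
$func$
SELECT public.unaccent('public.unaccent', $1)  -- schema-qualify function & dictionary
$func$  LANGUAGE sql IMMUTABLE;
```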

I used the function outlined there to allow the following (recommended!) multicolumn functional trigram GIN index:

CREATE INDEX pess_unaccent_nome_trgm_idx ON pess
USING gin (f_unaccent(abrev) gin_trgm_ops, f_unaccent(prenome) gin_trgm_ops);
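This index presumes both extensions are installed in the database (a one-time setup step, if not already done):

```sql
-- unaccent provides unaccent(); pg_trgm provides the gin_trgm_ops operator class
CREATE EXTENSION IF NOT EXISTS unaccent;
CREATE EXTENSION IF NOT EXISTS pg_trgm;
```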

If you are not familiar with trigram indexes, read this first:

And possibly:

Be sure to run the latest version of Postgres (9.5 at the time of writing). There have been substantial improvements to GIN indexes. And you'll be interested in improvements in pg_trgm 1.2, scheduled to be released with the upcoming Postgres 9.6:

Prepared statements are a common way to execute queries with parameters (especially with text from user input). Postgres has to find a plan that works best for any given parameter. Add wildcards as constants to the search term like this:

f_unaccent(p.abrev) ILIKE f_unaccent('%' || 'vicenti' || '%')

('vicenti' would be replaced with a parameter.) So Postgres knows we are dealing with a pattern that is neither anchored left nor right - which would allow different strategies. Related answer with more details:
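As a sketch, a prepared statement with the wildcards concatenated inside the query (table and column names assumed from the rewritten query above):

```sql
PREPARE pess_search(text) AS
SELECT p.id, p.abrev, p.prenome
FROM   pess p
WHERE  f_unaccent(p.abrev)   ILIKE f_unaccent('%' || $1 || '%')
   OR  f_unaccent(p.prenome) ILIKE f_unaccent('%' || $1 || '%');

EXECUTE pess_search('vicenti');  -- only the bare term is passed as parameter
```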

Or maybe re-plan the query for every search term (possibly using dynamic SQL in a function). But make sure planning time isn't eating any possible performance gain.

The WHERE condition on columns in pess contradicts the LEFT JOIN. Postgres is forced to convert that to an INNER JOIN. Unfortunately, the join comes late in the join tree. And since Postgres cannot reorder your joins (see below), that can become very expensive. Move the table to the first position in the FROM clause to eliminate rows early. Following LEFT JOINs do not eliminate any rows by definition. But with that many tables it is important to move joins that might multiply rows to the end.

You are joining 13 tables, 12 of them with LEFT JOIN, which leaves 12! possible combinations - or 11! * 2! if we account for the one LEFT JOIN that's really an INNER JOIN. That's too many for Postgres to evaluate all possible permutations for the best query plan. Read about join_collapse_limit:

The default setting for join_collapse_limit is 8, which means that Postgres won't try to reorder tables in your FROM clause and the order of tables becomes relevant.

One way to work around this is to split the performance-critical part into a CTE, like @joop commented. Don't set join_collapse_limit much higher, or planning times for queries involving many joined tables will deteriorate.
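To inspect the current value and, cautiously, raise it for the current session only (a sketch; 13 matches the number of tables in this query):

```sql
SHOW join_collapse_limit;      -- default is 8
SET join_collapse_limit = 13;  -- session-local; lets the planner reorder all 13 tables
```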

About your concatenated date named data:

cast(cast(e.ano as varchar(4))||'-'||right('0'||cast(e.mes as varchar(2)),2)||'-'|| right('0'||cast(e.dia as varchar(2)),2) as varchar(10)) as data

Assuming you build from three numeric columns for year, month and day, which are defined NOT NULL, use this instead:

e.ano::text || to_char(e.mes, 'FM"-"00')
            || to_char(e.dia, 'FM"-"00') AS data

About the FM template pattern modifier:

But really, you should store the date as data type date to begin with.
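If all three parts are always present, a proper date column could be derived once from the existing columns. A sketch, assuming ano, mes, and dia are valid NOT NULL integers (make_date() requires Postgres 9.4+):

```sql
-- One-time migration: build a real date from the three integer columns
ALTER TABLE esp ADD COLUMN data date;
UPDATE esp SET data = make_date(ano, mes, dia);
```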

Also simplified:

format('%s (%s)', p.abrev, p.prenome) AS determinador

Won't make the query faster, but it's much cleaner. See format().
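For illustration, format() renders NULL arguments as empty strings, which is what made the coalesce() calls in the original expression unnecessary:

```sql
SELECT format('%s (%s)', NULL, 'Silva');  -- returns ' (Silva)', no NULL propagation
```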

First things last, all the usual advice for performance optimization applies:

If you get all of this right, you should see much faster queries for all patterns.

Erwin Brandstetter
  • Thank you very much for your very detailed answer, Erwin! I'll answer each point as I go through. Have a great weekend! – Rodrigo Apr 01 '16 at 12:38
  • Why f_unaccent()? Because in Brazil accents are very common in people's names: João, José, Luís, Mário... I'm looking at your strategy. – Rodrigo Apr 01 '16 at 12:40
  • @Rodrigo: I think you misunderstood concerning `f_unaccent()`. `unaccent()` is obviously useful. I am just explaining *why* I use my custom function `f_unaccent()` instead of `unaccent()` out of the box. [Follow the link above](http://stackoverflow.com/questions/11005036/does-postgresql-support-accent-insensitive-collations/11007216#11007216) for details. – Erwin Brandstetter Apr 01 '16 at 14:14
  • Really, I misunderstood it. Your single answer gave me material for study for several days hehehe thanks again! – Rodrigo Apr 01 '16 at 14:40
  • The date is spread into three columns because sometimes we only have the year, or year+month, of a given record. There are some alternatives here: http://stackoverflow.com/questions/21599997/iso-8601-in-postgres-how-to-insert-only-year-in-type-date-incomplete-date-tim -- but I'm not sure yet which is the best. – Rodrigo Apr 01 '16 at 15:01
  • What's the point in using f_unaccent('%' || 'vicenti' || '%') instead of f_unaccent('%vicenti%')? – Rodrigo Apr 01 '16 at 15:05
  • @Rodrigo: I added a bit to address concatenated wildcards. About date storage: ask a new question if you don't find a solution (this answer is getting big already). You can always link to this one for context. – Erwin Brandstetter Apr 01 '16 at 15:26
  • For a long time I hadn't work on that. Now I remembered an important thing: this query is created on run-time. The WHERE in the end depends on which column the user has filtered (may involve many columns). So I think the order change (pess instead of esp) may not be practical. Still studying the query, though. – Rodrigo Jul 19 '16 at 21:08

A way to reduce the size of the range table is to squeeze a trivial part of the query out into a CTE, such as:

WITH zzz AS (
        SELECT l.id, l.nome
        , coalesce(v.val,v.valf)||' '||vu.unit as altura
        , coalesce(v1.val,v1.valf)||' '||vu1.unit as DAP
        FROM loc l
         left join var v on v.esp = l.id and v.key = 265    -- NB: the original query joins var on esp.id (v.esp = e.id), not loc.id
         left join varunit vu on vu.id = v.unit
         left join var v1 on v1.esp = l.id and v1.key = 264
         left join varunit vu1 on vu1.id = v1.unit
        )
select e.id, (select count(id) from imgitem ii
                where ii.tabid = e.id and ii.tab = 'esp'
                ) as imgs
        , e.ano, e.mes, e.dia
        , cast(cast(e.ano as varchar(4))||'-'||right('0'||cast(e.mes as varchar(2)),2)||'-'|| right('0'||cast(e.dia as varchar(2)),2) as varchar(10)) as data
        , pl.pltag, e.inpa, e.det, d.ano anodet
        , coalesce(p.abrev,'')||' ('||coalesce(p.prenome,'')||')' determinador
        , d.tax

        , zzz.altura as altura
        , zzz.DAP as DAP

        , d.fam, tf.nome família
        , d.gen, tg.nome gênero
        , d.sp , ts.nome espécie
        , d.inf, e.loc
        , zzz.nome AS localidade
        , e.lat, e.lon
from esp e
left join det d on e.det = d.id         -- these could possibly be
left join pess p on p.id = d.detby      -- plain joins
        -- 
left join tax tf on d.fam = tf.oldfam
left join tax tg on d.gen = tg.oldgen
left join tax ts on d.sp = ts.oldsp
 -- ### commented out, since it is never referred
 -- ### left join tax ti on d.inf = ti.oldinf
left join pl on pl.id = e.pl
left JOIN zzz ON zzz.id = e.loc
        -- 
WHERE unaccent(TEXT(coalesce(p.abrev,'')||' ('||coalesce(p.prenome,'')||')')) ilike unaccent('%vicen%')
        ;

[untested, since I don't have the table definitions]

joop