
I have a problem where adding a simple test condition (which should be evaluated only once per query) results in the whole query taking ten times as long to return. I'm running PostgreSQL 9.2, and the tables in play are as follows:

CREATE TABLE tags (
    tid     int4 UNIQUE NOT NULL,
    name    text UNIQUE NOT NULL,
    PRIMARY KEY (tid));

CREATE TABLE bugs (
    bid     int4 UNIQUE NOT NULL,
    type    int2 NOT NULL,
    title   text NOT NULL,
    PRIMARY KEY (bid));

CREATE TABLE bug_tags (
    bid     int4 REFERENCES bugs(bid) NOT NULL,
    tid     int4 REFERENCES tags(tid) NOT NULL,
    PRIMARY KEY (bid, tid));

CREATE INDEX bug_tags_bid_idx ON bug_tags (bid);
CREATE INDEX bug_tags_tid_idx ON bug_tags (tid);

The query must return the first 20 matching bugs ordered by bid, where the matching condition is that the bug's type must conform to a supplied bitmask (variable $TYPE), and the set of tags associated with the bug must be a superset of a supplied set of tagnames (denoted by $TAGNAMES in the prepared statement). After trying several approaches, below is the one that produces the best results. The basic idea is to first fetch all bugs satisfying the $TAGNAMES condition, and then test each one for the $TYPE condition. Moreover, if $TAGNAMES is empty, then we skip fetching those bugs and instead concentrate on just looping through the whole table (since we only want the first 20 rows, this is supposedly much faster). This latter test is the one that is causing the puzzling results.

WITH tids AS
            (SELECT tid
            FROM    tags
            WHERE   name = ANY ($TAGNAMES :: text[])),
    bugs_from_tags AS
            (SELECT DISTINCT t1.bid
            FROM    bug_tags AS t1
            WHERE   NOT EXISTS
                    (SELECT *
                    FROM    tids
                    WHERE   NOT EXISTS
                            (SELECT *
                            FROM    bug_tags AS t2
                            WHERE   t2.tid = tids.tid AND t2.bid = t1.bid)))
SELECT bid, type, title
FROM bugs
WHERE (type & $TYPE <> 0) AND
  (($TAGNAMES :: text[]) = '{}' OR bid IN (SELECT * FROM bugs_from_tags))
ORDER BY bid
LIMIT 20;

The version without the test is of course identical except for the WHERE clause, which looks like this instead:

WHERE (type & $TYPE <> 0) AND (bid IN (SELECT * FROM bugs_from_tags))

Below are the approximate EXPLAIN ANALYZE timings for a test database (with random data) consisting of 1,000,000 bugs, 2,000 tags, and 200,000 rows in bug_tags. The columns "with" and "without" refer to the query versions with/without the test against the empty array. The numbers are in milliseconds.

                            with    without
len($TAGNAMES) = 0:         4       1500
len($TAGNAMES) = 1:         13000   1600

Obviously, adding the test condition pays off handsomely when len($TAGNAMES) = 0, but results in disaster otherwise. This is puzzling because the condition is independent of each row, and therefore should only need to be evaluated once. Moreover, the results of EXPLAIN ANALYZE (which are quite long) show that the plans chosen for the with/without versions when len($TAGNAMES) = 1 are very different, even though they should be identical!

Anyway, I suspect I stumbled upon a quirk (or bug?) of PostgreSQL's query planner. Any ideas on how I can get around it?
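One workaround I can think of, though I haven't tried it yet, is to drop the OR entirely and choose between two prepared statements on the client side, so that the planner optimizes each case in isolation. Something like:

```sql
-- When len($TAGNAMES) = 0, skip the CTE entirely and just scan in bid order:
SELECT bid, type, title
FROM bugs
WHERE type & $TYPE <> 0
ORDER BY bid
LIMIT 20;

-- When len($TAGNAMES) > 0, run the original query but without the
-- empty-array test, i.e. with the WHERE clause from the "without" version.
```

This sidesteps the planner issue rather than solving it, however, so I'd still like to understand what is going on.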

NOTE: Results of EXPLAIN ANALYZE for both tests are here and here.

Jon Smark
    Can you post the output of `explain analyze` (ideally to http://explain.depesz.com) –  Feb 07 '13 at 15:44
  • It looks like the parentheses in the main part of the query are not balanced. This may be relevant because of the AND/OR combination. – Daniel Vérité Feb 07 '13 at 16:24
  • @DanielVérité: I've fixed the unbalanced parentheses. This was solely a copy/paste issue; the original query was correct and hence this is not the issue. – Jon Smark Feb 07 '13 at 16:44
  • @a_horse_with_no_name: I've added the results of EXPLAIN ANALYZE. – Jon Smark Feb 07 '13 at 17:44
    Look at the bottom of the plans - the slow one has decided to loop the IN ... bugs_from_tags scan. I'm guessing it's something to do with the OR – Richard Huxton Feb 07 '13 at 18:53
  • I would first try to replace the final `IN(...)` by the corresponding EXISTS, and maybe later replace the CTE by the corresponding JOINs or subselects. – wildplasser Feb 07 '13 at 19:32
  • I think you will find this question helpful: **[How to filter SQL results in a has-many-through relation](http://stackoverflow.com/questions/7364969/how-to-filter-sql-results-in-a-has-many-through-relation)** – ypercubeᵀᴹ Feb 09 '13 at 12:23

1 Answer


Reading the plans, the primary difference is that the fast one filters the results first and then joins them, while the slow one filters and then performs a nested-loop join.

Two things strike me about the slow query.

The first is that it is choosing a bad plan. This can happen when the planner does not think enough memory is available to make the better plan look cheap. Try setting effective_cache_size a bit higher and work_mem a little higher.
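For example, at the session level (the values here are placeholders, not recommendations; tune them to your hardware):

```sql
-- Session-level experiment; adjust to the machine's actual RAM.
SET effective_cache_size = '4GB';  -- how much of the database the OS cache can plausibly hold
SET work_mem = '64MB';             -- per-sort / per-hash memory budget
```

If a setting helps, it can be made permanent in postgresql.conf.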

The second is that the slow query is missing a hash aggregate. I would recommend changing the DISTINCT to a GROUP BY to see if that helps. My guess is that on the slow one it is materializing the CTE and then running the query as a nested loop against it.
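A sketch of that change; only the bugs_from_tags CTE differs from your original:

```sql
bugs_from_tags AS
        (SELECT t1.bid
        FROM    bug_tags AS t1
        WHERE   NOT EXISTS
                (SELECT *
                FROM    tids
                WHERE   NOT EXISTS
                        (SELECT *
                        FROM    bug_tags AS t2
                        WHERE   t2.tid = tids.tid AND t2.bid = t1.bid))
        GROUP BY t1.bid)  -- same result as SELECT DISTINCT, but may plan differently
```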

However, beyond these and possibly putting more memory in the system, I don't see anything that strikes me as an immediate answer. I hope this is helpful, and if you can't get help here, the pgsql-performance mailing list is often very good at helping people.
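One more thing worth trying, echoing wildplasser's comment above: replace the final `IN (...)` with an equivalent `EXISTS`, which the planner sometimes handles better. A sketch (untested) of the changed WHERE clause:

```sql
WHERE (type & $TYPE <> 0)
  AND EXISTS
      (SELECT 1
      FROM   bugs_from_tags AS bft
      WHERE  bft.bid = bugs.bid)
```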

Chris Travers