6

There is a table T(user, timestamp, ...) with 100 million+ records (PostgreSQL 9.1).

A query of the form

SELECT * 
FROM T 
WHERE user='abcd' 
ORDER BY timestamp 
LIMIT 1

is using the timestamp index instead of the user index when there are ~100,000 records for that user.

Using the timestamp index always gives poor results (20+ sec), as it effectively scans all records. Bypassing the timestamp index by changing the query to use ORDER BY DATE(timestamp) makes the query fall back to the user index and return results in under 100 ms.
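
The rewritten query is roughly the following (note that DATE() truncates to day precision, so strictly an additional timestamp sort key would be needed to reproduce the exact original ordering):

SELECT * 
FROM T 
WHERE user='abcd' 
ORDER BY DATE(timestamp) 
LIMIT 1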

  • Total RAM: 64 GB
  • shared_buffers: 16 GB
  • work_mem: 32 MB

Why is PostgreSQL ignoring the user index and using the timestamp index instead (even though the timestamp index has to look at far more records)? Are there any PostgreSQL config parameters that can be altered to make the query use the user index?

vyegorov
Anoop
  • http://wiki.postgresql.org/wiki/SlowQueryQuestions – Nov 25 '14 at 08:26
  • Post the `EXPLAIN` results – DrColossos Nov 25 '14 at 09:12
  • Most likely problem - stale or misconfigured statistics on the table. – Ihor Romanchenko Nov 25 '14 at 09:13
  • This is a valid question, please, do not close it! – vyegorov Nov 25 '14 at 09:20
  • @vyegorov I can't speak for the other close vote, but my vote was to migrate to dba.stackexchange.com, not to close the question – GarethD Nov 25 '14 at 09:38
  • @GarethD, I think that this *is* a programming question, as fixing it in the current state of affairs requires rewriting the query. Performance issues should be taken on by programmers (along with DBAs), thinking in advance about how their queries will behave and planning indexes accordingly. – vyegorov Nov 25 '14 at 11:38
  • @vyegorov Stack Overflow is the generic Q&A site, so it is possible to make an argument for almost any question belonging here; *"Off-topic"* is therefore a slight misnomer, and what it really means is *"Better suited to another site"*. It is not a vindictive close vote, nor one that should imply anything is wrong with the question; quite simply, you are more likely to get a better answer if you ask on the most relevant site. In the same way, I could buy a laptop from my local supermarket and it might be fine, and good value, but I would still rather go to a specialist electronics retailer. – GarethD Nov 25 '14 at 11:55

1 Answer

7

Nice question, I came across this problem a while ago.

Why does this happen?

You should look at how often the user='abcd' value appears in your statistics, like this:

SELECT attname, null_frac, avg_width, n_distinct,
       most_common_vals, most_common_freqs, histogram_bounds
  FROM pg_stats
 WHERE tablename='T';

My guess is that this value occurs quite often and you'll find it in the most_common_vals output. Picking the element at the same position in most_common_freqs gives you the ratio for the value; multiply it by the total number of rows (which can be obtained from pg_class) to get the number of rows that are estimated to have the 'abcd' value.
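
Concretely, with the figures from the question, the estimate can be derived like this (a sketch; it assumes 'abcd' actually shows up in most_common_vals rather than only in the histogram):

-- frequency of 'abcd' for the "user" column
SELECT most_common_vals, most_common_freqs
  FROM pg_stats
 WHERE tablename='T'
   AND attname='user';

-- planner's idea of the total row count
SELECT reltuples
  FROM pg_class
 WHERE relname='T';

-- estimated matching rows = frequency * reltuples,
-- e.g. 0.001 * 100,000,000 = ~100,000 rows for user='abcd'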

The planner assumes that rows with a given value are evenly (uniformly) distributed throughout the table. In reality, things are different, of course. Also, there are currently no correlated (cross-column) statistics, although some work is being done in this direction.

So, let's take the user='abcd' value, which has a 0.001 ratio (per the question) in the corresponding most_common_freqs entry. This means the value is expected to occur once in every 1,000 rows (assuming a uniform distribution). It follows that if we scan the table in any order, we should hit a user='abcd' row within roughly 1,000 rows. Sounds like it should be fast! The planner "thinks" the same and chooses the index on the timestamp column.

But it is not. If we assume that your table T contains logs of user activity, and user 'abcd' was on vacation for the last 3 weeks, then we'll have to read quite a lot of rows via the timestamp index (3 weeks' worth of data) before we actually hit the row we want. Well, you as the DBA know this, but the planner assumes a uniform distribution.
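
You can watch the planner make this choice on your own data by comparing the two plans (a sketch; EXPLAIN (ANALYZE, BUFFERS) is available on 9.1, and the second statement is the DATE() workaround from the question):

-- plan that (per the question) walks the timestamp index, filtering on user
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM T WHERE user='abcd' ORDER BY timestamp LIMIT 1;

-- plan that falls back to the user index, then sorts the matching rows
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM T WHERE user='abcd' ORDER BY DATE(timestamp) LIMIT 1;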

So, how do we fix it?

You'll have to trick the planner into doing what you need, as you have more knowledge of your data than it does.

  1. Use the OFFSET 0 trick with a subquery:

    SELECT *
      FROM
      (
         SELECT * FROM T WHERE user='abcd' OFFSET 0
      ) AS s
      ORDER BY timestamp 
      LIMIT 1;
    

    This trick prevents the subquery from being inlined, therefore the inner part is planned and executed on its own. (Note that the subquery in FROM needs an alias, s here.)

  2. Use a CTE (named subquery):

    WITH s AS (
            SELECT * FROM T WHERE user='abcd'
    )
    SELECT *
      FROM s
     ORDER BY timestamp 
     LIMIT 1;
    

    Per documentation:

    A useful property of WITH queries is that they are evaluated only once per execution of the parent query, even if they are referred to more than once by the parent query or sibling WITH queries.

  3. Use count(*) for aggregated queries:

    SELECT min(session_id), count(*) -- instead of simply `min(session_id)`
      FROM T 
     WHERE user='abcd';
    

    This is not really applicable here, but I wanted to mention it. (With an ungrouped aggregate the query returns a single row anyway, so the ORDER BY timestamp LIMIT 1 from the original query is dropped here; it would otherwise be rejected.)

And please, consider upgrading to 9.3.

P.S. More on row estimates in the docs, of course.

vyegorov
  • Thanks vyegorov. That perfectly explains the behavior. I confirmed by checking pg_stats. Are there any improvements in 9.3? – Anoop Nov 25 '14 at 10:17
  • @Anoop, not for this case, sorry. But it is generally a good idea to stick to the current major version, as it includes other performance and security improvements. – vyegorov Nov 25 '14 at 11:16