59

I have a table I'm doing an ORDER BY on before a LIMIT and OFFSET in order to paginate.

Adding an index on the ORDER BY column makes a massive difference to performance (when used in combination with a small LIMIT). On a 500,000 row table, I saw a 10,000x improvement adding the index, as long as there was a small LIMIT.

However, the index has no impact for high OFFSETs (i.e. later pages in my pagination). This is understandable: a b-tree index makes it easy to iterate in order from the beginning but not to find the nth item.
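
To illustrate the pattern (table and column names here are just for illustration, not my real schema):

create index items_by_created_at on items (created_at);

-- early pages: fast, the index is walked from the start and stops after LIMIT rows
select * from items order by created_at limit 20;

-- later pages: slow, the first 100,000 index entries still have to be read and discarded
select * from items order by created_at limit 20 offset 100000;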

It seems that what would help is a counted b-tree index, but I'm not aware of support for these in PostgreSQL. Is there another solution? It seems that optimizing for large OFFSETs (especially in pagination use-cases) isn't that unusual.

Unfortunately, the PostgreSQL manual simply says "The rows skipped by an OFFSET clause still have to be computed inside the server; therefore a large OFFSET might be inefficient."

James Tauber

6 Answers

53

You might want a computed index.

Let's create a table:

create table sales(day date, amount real);

And fill it with some random stuff:

insert into sales 
    select current_date + s.a as day, random()*100 as amount
    from generate_series(1,20) as s(a);

Index it by day, nothing special here:

create index sales_by_day on sales(day);

Create a row position function. There are other approaches, this one is the simplest:

create or replace function sales_pos (date) returns bigint 
   as 'select count(day) from sales where day <= $1;' 
   language sql immutable;

Check if it works (don't call it like this on large datasets though):

select sales_pos(day), day, amount from sales;

     sales_pos |    day     |  amount  
    -----------+------------+----------
             1 | 2011-07-08 |  41.6135
             2 | 2011-07-09 |  19.0663
             3 | 2011-07-10 |  12.3715
    ..................

Now the tricky part: add another index computed on the sales_pos function values:

create index sales_by_pos on sales using btree(sales_pos(day));

Here is how you use it. 5 is your "offset", 10 is the "limit":

select * from sales where sales_pos(day) >= 5 and sales_pos(day) < 5+10;

        day     | amount  
    ------------+---------
     2011-07-12 | 94.3042
     2011-07-13 | 12.9532
     2011-07-14 | 74.7261
    ...............

It is fast, because when you call it like this, Postgres uses precalculated values from the index:

explain select * from sales 
  where sales_pos(day) >= 5 and sales_pos(day) < 5+10;

                                    QUERY PLAN                                
    --------------------------------------------------------------------------
     Index Scan using sales_by_pos on sales  (cost=0.50..8.77 rows=1 width=8)
       Index Cond: ((sales_pos(day) >= 5) AND (sales_pos(day) < 15))

Hope it helps.

Mike Ivanov
  • There's a lengthy and very detailed blog post on this technique in [select * from depesz blog: Pagination with fixed order](http://www.depesz.com/index.php/2011/05/20/pagination-with-fixed-order/) – Tometzky Jul 08 '11 at 08:04
  • @Tometzky - very nice idea! As an improvement I would suggest using window functions (9.0+ only) over the grouping column. – Mike Ivanov Jul 08 '11 at 16:50
  • Great. So now every time you insert a single value into the table, it re-calculates this for each item in the table? – Konstantine Rybnikov Oct 18 '13 at 10:41
  • @KonstantineRybnikov Hmm.. No, but you really don't need to recalculate the index as long as you insert entries strictly in the order of their dates and never delete them (which is a good idea anyway). In this case the record positions will never change. – Mike Ivanov Oct 18 '13 at 14:55
  • @MikeIvanov does PostgreSQL use this kind of optimization? (does it only recalculate the ones that need it?) – Konstantine Rybnikov Oct 18 '13 at 16:16
  • I'm assuming this doesn't work if you have an additional column called salesperson and you want to dynamically pull reports by each salesperson? – nomad Aug 11 '17 at 21:32
  • the insert query is wrong, as it refers to a non-existing column alias "s" – Sh eldeeb Aug 20 '23 at 11:49
21

I don't know anything about "counted b-tree indexes", but one thing we've done in our application to help with this is to break our queries in two, possibly using a sub-query. My apologies for wasting your time if you're already doing this.

SELECT *
FROM massive_table
WHERE id IN (
    SELECT id
    FROM massive_table
    WHERE ...
    LIMIT 50
    OFFSET 500000
);

The advantage here is that, while it still has to calculate the proper ordering of everything, it doesn't sort entire rows, only the id column.
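
One caveat: for the skipped window to be deterministic, the inner query also needs an ORDER BY, and the outer query needs one to keep the final rows in order. A sketch of the same idea with a hypothetical ordering column sort_col:

SELECT *
FROM massive_table
WHERE id IN (
    SELECT id
    FROM massive_table
    ORDER BY sort_col
    LIMIT 50
    OFFSET 500000
)
ORDER BY sort_col;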

Jonathan Hall
  • It's a really good solution when using the crosstab() function. My first queries (limit 100, offset 0) take 14 ms, but the last one (limit 100, offset 14900) takes almost 3 seconds. With this solution all my queries run in about 12 ms(!) – Paweł BB Drozd Oct 07 '17 at 21:31
  • This is actually a pretty good solution if you are limited to `LIMIT` and `OFFSET` pagination because of the UI or a complex query that keyset pagination won't cover. I did a quick test with a somewhat complicated query with an offset of `9e6` on a table of `1e7` rows with three columns. This method is around 270% faster. – KuN May 12 '21 at 03:58
  • Per the PostgreSQL documentation: "When using LIMIT, it is important to use an ORDER BY clause that constrains the result rows into a unique order. Otherwise you will get an unpredictable subset of the query's rows. You might be asking for the tenth through twentieth rows, but tenth through twentieth in what ordering? The ordering is unknown, unless you specified ORDER BY." See https://www.postgresql.org/docs/current/queries-limit.html – Philippe Jun 30 '22 at 14:47
4

Instead of using an OFFSET, a very efficient trick is to use a temporary table:

CREATE  TEMPORARY TABLE just_index AS
SELECT ROW_NUMBER() OVER (ORDER BY myID), myID
FROM mytable;

For 10,000,000 rows it takes about 10 s to create. Then, when you want to SELECT or UPDATE your table, you simply:

SELECT *
FROM mytable
INNER JOIN (
    SELECT just_index.myID
    FROM just_index
    WHERE row_number >= *your offset*
    LIMIT 1000000
) indexes ON mytable.myID = indexes.myID

Filtering mytable with only just_index is more efficient (in my case) with an INNER JOIN than with a WHERE myID IN (SELECT ...)

This way you don't have to store the last myID value; you simply replace the offset with a WHERE clause that uses indexes

Rolintocour
  • Thanks! I improved the performance by putting all the formatted information in the temp table directly, so I avoided the INNER JOIN and filtered directly on the temp table – Genarito Nov 03 '20 at 15:23
2

It seems that optimizing for large OFFSETs (especially in pagination use-cases) isn't that unusual.

It seems a little unusual to me. Most people, most of the time, don't seem to skim through very many pages. It's something I'd support, but wouldn't work hard to optimize.

But anyway . . .

Since your application code knows which ordered values it's already seen, it should be able to reduce the result set and reduce the offset by excluding those values in the WHERE clause. Assuming you order by a single column and it's sorted ascending, your app code can store the last value on the page, then add AND your-ordered-column-name > last-value-seen to the WHERE clause in some appropriate way.
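
A minimal sketch of that idea, with hypothetical table and column names and an index on the ordering column:

-- first page
SELECT * FROM items ORDER BY created_at LIMIT 20;

-- later pages: the application stores the last created_at value it displayed
-- and filters on it instead of using OFFSET
SELECT *
FROM items
WHERE created_at > '2011-07-10'   -- last value seen on the previous page
ORDER BY created_at
LIMIT 20;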

Mike Sherrill 'Cat Recall'
  • it doesn't necessarily know what it's already seen, as pagination would require the ability to jump to, say, page 1000 – James Tauber Jul 08 '11 at 02:23
  • That's probably application-specific. Google lets you jump 9 pages ahead or 9 pages back, but doesn't let you just jump to page 1000. Google also seems to encode the starting item number in the URL, which I imagine could be used to reduce the size of the result set and the size of the offset. – Mike Sherrill 'Cat Recall' Jul 08 '11 at 12:17
  • One common example of this sort of access pattern is a forum topic with thousands of posts. Users jump to offset 0 to read the original post, and then some large offset to read the latest responses, and then some random offset to see points of interest in the discussion (like deep links or replies to their own posts) – danneu Jan 06 '15 at 03:51
1

I recently worked on a problem like this and wrote a blog post about how I approached it; I hope it is helpful to someone. I use a lazy-list approach with partial fetching: I replaced the LIMIT and OFFSET pagination of the query with manual pagination. In my example, the select returns 10 million records; I fetch them and insert them into a "temporary table":

create or replace function load_records ()
returns void as $$
BEGIN
  -- tmp_table is assumed to already exist, with the row counter (counterrow)
  -- as its first column
  drop sequence if exists temp_seq;
  create temp sequence temp_seq;
  insert into tmp_table
  SELECT linea.*
  FROM
  (
    select nextval('temp_seq') as counterrow, *
    from table1 t1
    join table2 t2 on (t2.fieldpk = t1.fieldpk)
    join table3 t3 on (t3.fieldpk = t2.fieldpk)
  ) linea;
END;
$$ language plpgsql;

After that, I can paginate without counting each row, using the assigned sequence values instead:

select * from tmp_table where counterrow >= 9000000 and counterrow <= 9025000

From the Java perspective, I implemented this pagination through partial fetching with a lazy list, that is, a list that extends AbstractList and implements the get() method. The get() method can use a data access interface to fetch the next chunk of data and release the memory already consumed:

@Override
public E get(int index) {
    // If the requested index is beyond the chunk currently in memory,
    // discard it and fetch the next chunk from the data access layer.
    if (bufferParcial.size() <= (index - lastIndexRoulette)) {
        lastIndexRoulette = index;
        bufferParcial = new ArrayList<E>();
        bufferParcial.addAll(daoInterface.getBufferParcial());
        if (bufferParcial.isEmpty()) {
            return null;
        }
    }
    // Translate the absolute index into a position within the current chunk.
    return bufferParcial.get(index - lastIndexRoulette);
}

On the other hand, the data access interface uses a query to paginate and implements a method to iterate progressively, 25,000 records at a time, until all of them have been read.

Results for this approach can be seen here: http://www.arquitecturaysoftware.co/2013/10/laboratorio-1-iterar-millones-de.html

0

You can also use partitioned tables: in this case you divide the massive data into smaller chunks, so the offset value can always stay relatively small.
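
A minimal sketch of that idea, assuming PostgreSQL 11 or later with declarative partitioning (table, column, and partition names are hypothetical); querying a single partition, or filtering on the partition key so partition pruning kicks in, keeps the effective offset small:

-- hypothetical table, range-partitioned by date
create table events (
    id         bigserial,
    created_at date not null,
    payload    text
) partition by range (created_at);

create table events_2023 partition of events
    for values from ('2023-01-01') to ('2024-01-01');
create table events_2024 partition of events
    for values from ('2024-01-01') to ('2025-01-01');

create index on events (created_at);

-- paginate within one partition; the offset never exceeds the partition size
select * from events_2024 order by created_at limit 50 offset 1000;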

Sh eldeeb