0

I have this query where I'm trying to get the total number of records based on the given filters and trying to fetch the limited number of data as required for pagination in a single query. Earlier I was doing this in a two step process where i calculated the number of records first based on Optimising Select number of rows in PostgreSql for multiple group by fields and then got the required number of rows in a different query based on the page size and the offset needed

EXPLAIN select count (1) OVER () AS numrows , lower(column1) as column1, column2, column3, column4, column5, column6, column7, column8, sum(column9) as column9, sum(column10) as column10 
from tableName tablename 
where 
    column8 in (SOME 50-100 sets) 
    and column_date >= '2013-09-01' 
    and column_date < '2013-09-30' 
    group by lower(column1), column2, column3, column4, column5, column6, column7, column8 
    ORDER BY column9 desc 
    LIMIT 1000 OFFSET 0

XN Limit  (cost=1000134721702.61..1000134721702.63 rows=1000 width=67)
  ->  XN Merge  (cost=1000134721702.61..1000135118514.84 rows=158724893 width=67)
        Merge Key: sum(column9)
        ->  XN Network  (cost=1000134721702.61..1000135118514.84 rows=158724893 width=67)
              Send to leader
              ->  XN Sort  (cost=1000134721702.61..1000135118514.84 rows=158724893 width=67)
                    Sort Key: sum(column9)
                    ->  XN Window  (cost=107149638.61..113101822.10 rows=158724893 width=67)
                          ->  XN Network  (cost=107149638.61..108340075.31 rows=158724893 width=67)
                                Send to slice 0
                                ->  XN HashAggregate  (cost=107149638.61..108340075.31 rows=158724893 width=67)
                                      ->  XN Seq Scan on tableName tablename  (cost=0.00..67468415.44 rows=1587248927 width=67)
                                            Filter: ((column_date < '2013-09-30'::date) AND (column_date >= '2013-09-01'::date) AND (column8 = ANY ('{SOME 50-100 sets}'::integer[])))

version detail

PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.666 

Thanks

Community
  • 1
  • 1
rohitmb
  • 151
  • 1
  • 3
  • 13
  • Side note: You need to upgrade your postgres a least to 8.4 . 8.2 is outdated and not supported. – Ihor Romanchenko Sep 30 '13 at 13:58
  • @IgorRomanchenko: he/she must already be using 8.4 as window functions weren't available in previous versions. –  Sep 30 '13 at 16:31
  • Are you sure you are using Postgres? The explain plan seems to indicate you are using some Postgres fork. –  Sep 30 '13 at 16:32
  • @a_horse_with_no_name it is Postgresql 8.2 – rohitmb Oct 01 '13 at 05:01
  • It can't be. Postgres 8.2 did **not** have window functions. Please post the result of `select version()` –  Oct 01 '13 at 06:12
  • @a_horse_with_no_name my bad it says PostgreSql 8.0.2 – rohitmb Oct 01 '13 at 10:48
  • Then it's not even 8.2, but 8.0 and I have no clue how you can use a window function with that ancient and obsolete version. I still think it must be some different fork. Postgres does not display things like "XN Limit" or "XN Merge" in the execution plan. –  Oct 01 '13 at 10:59
  • 3
    From the explain output, the database is amazon redshift which is based on postgresql 8.0.2 but [heavily modified](http://docs.aws.amazon.com/redshift/latest/dg/c_redshift-and-postgres-sql.html) – Daniel Vérité Oct 01 '13 at 11:25

1 Answers1

4

Redshift is not optimised for paginated record returns. It is designed to maximise the performance of analytic queries that involve many records but return only a small output.

If you intend to paginate your results I would strongly suggest sending the output of your SELECT to a temp table first. You can then do a simple (and fast!) COUNT(*) over that table and paginate against it without forcing the query to be re-executed.

Joe Harris
  • 13,671
  • 4
  • 47
  • 54