
I need to pull back the first 300 rows from a 10MM row table, as well as getting a count of the total number of matching records.

I can do this in two queries, something like:

SELECT * FROM table WHERE field = value LIMIT 300;
SELECT count(*) FROM table WHERE field = value;

Or I could use an OVER():

SELECT *, COUNT(*) OVER() AS total FROM table WHERE field = value LIMIT 300;

Which would be the most efficient? I don't care about the need to run two queries, I'm after the most efficient solution. I'm no expert, and I've tried to run an "explain" but it doesn't make much sense to me. This is running on Amazon Redshift.
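For a sense of what the single-query window-function form returns, here is a toy sketch using SQLite (which also supports `COUNT(*) OVER()` since 3.25); the table and column names are made up, and Redshift's execution characteristics will of course differ, so this only shows the result shape, not relative efficiency:

```python
import sqlite3

# Illustrative only: a tiny in-memory table standing in for the 10MM-row one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (field INTEGER, payload TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(i % 2, f"row{i}") for i in range(10)])

# COUNT(*) OVER() is computed over all matching rows *before* LIMIT is
# applied, so each of the 3 returned rows carries the full match count (5).
rows = conn.execute(
    "SELECT payload, COUNT(*) OVER() AS total "
    "FROM t WHERE field = 1 LIMIT 3"
).fetchall()
print(rows)
```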

Kong
  • [Related](http://stackoverflow.com/q/7943233/314291) - it seems there are quick ways to get approximate row counts in Postgres from `pg_class`, without the need to count all rows. – StuartLC Jun 09 '15 at 05:16
  • 1
    When it comes to performance, first set *goals*, then write simple, clear code that expresses what you *want*. Then *measure* the performance, and only if it doesn't meet the goals should you start digging into it. – Damien_The_Unbeliever Jun 09 '15 at 06:33
  • What do you mean by the first rows? Don't you need to define order? Why do want only 300? – Guy Jun 11 '15 at 22:00
  • What happened when you ran them both? What does the execution plan (using `explain analyze` not just `explain`) say? **Edit** your question and add the output of `explain analyze` for the single statement (make sure it's formatted) –  Aug 25 '15 at 14:05

1 Answer


If your sort key is the timestamp field, the most efficient query will be:

select *
from (
    select *,
           count(*) over () as total,
           row_number() over (order by timestamp) as rank
    from table
    where field = value
) sub
where rank < 301;
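The same pattern can be sketched in SQLite for illustration: rank the matching rows by a timestamp-like column, attach the total match count, and keep ranks 1 through 300 (here 1 through 3). The table and column names (`t`, `field`, `ts`) are hypothetical, and SQLite says nothing about how Redshift would plan this:

```python
import sqlite3

# Toy stand-in table: odd ts values have field = 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (field INTEGER, ts INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(i % 2, i) for i in range(10)])

# ROW_NUMBER() orders the matches by ts; COUNT(*) OVER() attaches the
# total number of matches (5) to every row before the rank filter.
rows = conn.execute("""
    SELECT ts, total FROM (
        SELECT ts,
               COUNT(*) OVER () AS total,
               ROW_NUMBER() OVER (ORDER BY ts) AS rnk
        FROM t
        WHERE field = 1
    ) sub
    WHERE rnk < 4
""").fetchall()
print(rows)
```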
user3600910