2

I need to do a process on all the records in a table. The table could be very big so I rather process the records page by page. I need to remember the records that have already been processed so there are not included in my second SELECT result.

Like this:

For first run, [SELECT 100 records FROM MyTable]

For second run, [SELECT another 100 records FROM MyTable]

and so on..

I hope you get the picture. My question is how do I write such select statement?

I'm using oracle btw, but would be nice if I can run on any other db too. I also don't want to use store procedure.

Thank you very much!

OMG Ponies
  • 325,700
  • 82
  • 523
  • 502
joe
  • 63
  • 1
  • 8
  • Are you sure you need to run multiple SELECTs? Why not open a cursor, fetch 100 records, process them, then fetch the next batch. That way you don't have to keep track of what you've processed, plus the query will be consistent as of the point in time it was started. – Jeffrey Kemp Dec 20 '10 at 03:20
  • Possible duplicate of [Paging with Oracle](https://stackoverflow.com/questions/241622/paging-with-oracle) – Vadzim May 02 '18 at 06:57

2 Answers2

2

Any solution you come up with to break the table into smaller chunks, will end up taking more time than just processing everything in one go. Unless the table is partitioned and you can process exactly one partition at a time.

If a full table scan takes 1 minute, it will take you 10 minutes to break up the table into 10 pieces. If the table rows are physically ordered by the values of an indexed column that you can use, this will change a bit due to clustering factor. But it will anyway take longer than just processing it in one go.

This all depends on how long it takes to process one row from the table of course. You could chose to reduce the load on the server by processing chunks of data, but from a performance perspective, you cannot beat a full table scan.

Ronnis
  • 12,593
  • 2
  • 32
  • 52
1

You are most likely going to want to take advantage of Oracle's stopkey optimization, so you don't end up with a full tablescan when you don't want one. There are a couple ways to do this. The first way is a little longer to write, but let's Oracle automatically figure out the number of rows involved:

select *
from
(
  select rownum rn, v1.*
  from (
    select *
    from table t
    where filter_columns = 'where clause'
    order by columns_to_order_by
  ) v1
  where rownum <= 200
)
where rn >= 101;

You could also achieve the same thing with the FIRST_ROWS hint:

  select /*+ FIRST_ROWS(200) */ *
  from (
    select rownum rn, t.*
    from table t
    where filter_columns = 'where clause'
    order by columns_to_order_by
  ) v1
  where rn between 101 and 200;

I much prefer the rownum method, so you don't have to keep changing the value in the hint (which would need to represent the end value and not the number of rows actually returned to the page to be accurate). You can set up the start and end values as bind variables that way, so you avoid hard parsing.

For more details, you can check out this post

Craig
  • 5,740
  • 21
  • 30
  • OMG Ponies - Why don't you think the first query example will work? It works fine for me.. – Craig Dec 20 '10 at 18:42