
I have a system that needs to suck up an entire MS SQL database. Currently it does so with something like:

select top 1000 * from table where id > 0 order by id;

Then, for the next chunk:

select top 1000 * from table where id > 1000 order by id;

And then:

select top 1000 * from table where id > 2000 order by id;

And so forth.

In MySQL, I've learned that LIMIT/OFFSET queries are brutally slow because the database has to sort the results, then scan past all the skipped OFFSET rows before returning anything. When that offset gets big, life starts to suck as the read count skyrockets.

My question is this: does the same problem apply to TOP? Put another way, can I expect a really high read count when I run these queries on a database with, say, 10,000,000 records, at the point where id > 9,999,000? If so, are there any ways to handle this better?

mlissner

1 Answer


It will be very fast if id is indexed, because SQL Server can seek directly to the first qualifying row. If that column is not indexed, it would cause a full table scan.

I would suggest the following in addition:

  select * from table where id > 0 and id <= 1000 order by id;

This way, if some ids are missing, the chunk boundaries can't overlap from one query to the next, so you won't get bugs (duplicate rows).
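For completeness, another robust variant is to drive each chunk from the last id actually returned rather than from fixed boundaries; this also stays correct when ids have gaps, and each query is a cheap index seek. Here's a minimal sketch in Python using SQLite purely for illustration (the table, column names, and data are made up, and on SQL Server you'd write `SELECT TOP 1000 ...` instead of `LIMIT 1000`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
# Insert ids 1..2500 but skip multiples of 7, so the id sequence has gaps.
conn.executemany(
    "INSERT INTO t (id, val) VALUES (?, ?)",
    [(i, f"row{i}") for i in range(1, 2501) if i % 7 != 0],
)

def fetch_all_in_chunks(conn, chunk_size=1000):
    """Yield every row by seeking past the last id seen in each chunk."""
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, val FROM t WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, chunk_size),
        ).fetchall()
        if not rows:
            break
        yield from rows
        last_id = rows[-1][0]  # seek from the last id actually returned

rows = list(fetch_all_in_chunks(conn))
```

Because the next query starts strictly after the last id fetched, no row is ever duplicated or skipped, regardless of gaps.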

Hogan
  • It won't cause an index scan of 9,999,000 rows for the last chunk? I think that's part of my worry. I'm pretty sure MySQL **would** do that. – mlissner Aug 14 '12 at 00:18
  • @mlissner this is the question I answered. If the column id has an index then it will not. This is fundamental to the way SQL Server works. – Hogan Aug 14 '12 at 00:19