
I have a huge table with rows in the millions. I need to process all of them in a program.

I am interested in only one column, studentId. How do I write a query that loads 2000 of those items, feeds them to the program for processing, then gets the next batch of 2000, and so on?

The problem is how to keep track of which items I have already processed, since the list isn't sorted.

EDIT: The process I need to do is this: I have a table in the form of [studentId, name, DepartmentId]

The number of students here could easily be 30 to 40 million. The departments they belong to typically number in the thousands, so DepartmentIds could be in the range of 5000 to 6000. The rows may be interspersed. Now I need to get all students belonging to a specific department and process them in a program. I cannot load a million student IDs at once, so I need some sort of batching. Thanks.

user1202434

3 Answers


If that is MySQL you can do something like this:

SELECT studentId FROM table_name LIMIT 0, 2000

You just need to loop in your code (or stored procedure) and advance the LIMIT offset on each iteration.
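A minimal sketch of that loop, using Python with sqlite3 purely for illustration (the LIMIT/OFFSET syntax is the same in MySQL; the table and column names are taken from the question). An ORDER BY is added because LIMIT/OFFSET without a stable order can skip or repeat rows between batches:

```python
import sqlite3

def process_in_batches(conn, batch_size=2000):
    """Page through studentId values with LIMIT/OFFSET."""
    offset = 0
    processed = []
    while True:
        rows = conn.execute(
            "SELECT studentId FROM students "
            "ORDER BY studentId LIMIT ? OFFSET ?",
            (batch_size, offset),
        ).fetchall()
        if not rows:
            break  # no more rows: all batches done
        for (student_id,) in rows:
            processed.append(student_id)  # stand-in for real processing
        offset += batch_size
    return processed
```

Note that large OFFSETs get slower as you page deeper, since the database must scan and discard all skipped rows each time.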

AbcAeffchen
  • 14,400
  • 15
  • 47
  • 66
0

If you run the same SELECT multiple times, SQL Server does not guarantee the same row order in the results.

So you have two choices:

  1. Order the results and use the OFFSET ... FETCH clause (or the TOP clause, but TOP isn't a standard feature). You could also use ROW_NUMBER with OVER, but OFFSET is simpler here and better for performance.
  2. Use your ID key in the WHERE clause (e.g. WHERE studentId BETWEEN ...), so each batch picks up where the previous one ended.
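The second option (keyset paging on the ID) can be sketched like this, again using Python with sqlite3 for illustration; the table and column names come from the question, and last_id is a hypothetical tracking variable:

```python
import sqlite3

def fetch_department_in_batches(conn, department_id, batch_size=2000):
    """Keyset pagination: each batch resumes after the last seen studentId."""
    last_id = -1  # assumes IDs are non-negative
    batches = []
    while True:
        rows = conn.execute(
            "SELECT studentId FROM students "
            "WHERE DepartmentId = ? AND studentId > ? "
            "ORDER BY studentId LIMIT ?",
            (department_id, last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        batches.append([r[0] for r in rows])
        last_id = rows[-1][0]  # next batch starts after this ID
    return batches
```

Unlike large OFFSETs, keyset paging uses an index seek to the last seen ID, so it stays fast even deep into a 40-million-row table.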
marc_s
Chehre

You can use a SqlDataReader: read each batch, process it, and keep reading until you finish. Close the reader only when all the batches are done.

You can change the isolation level if locking is a problem.
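The original suggestion is .NET's SqlDataReader; the same streaming idea is sketched here in Python using a DB-API cursor with fetchmany (sqlite3 stands in for the real database; the batch processing step is a placeholder):

```python
import sqlite3

def stream_students(conn, department_id, batch_size=2000):
    """Stream rows through one open cursor instead of re-querying per batch."""
    cur = conn.execute(
        "SELECT studentId FROM students WHERE DepartmentId = ?",
        (department_id,),
    )
    seen = 0
    while True:
        batch = cur.fetchmany(batch_size)
        if not batch:
            break
        seen += len(batch)  # stand-in for real per-batch processing
    cur.close()  # close only after all batches are consumed
    return seen
```

The query runs once and the cursor holds its position, so there is no risk of rows shifting between batches; the trade-off is that the connection (and any locks, depending on isolation level) is held for the whole run.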

JotaBe