
I'm currently working on a program that is converting an old database to a newer style and one of the tasks I've been assigned is to remove some old binary columns and convert those into documents in our repository.

We have about 1.1 million rows in the table and my process takes about 12 hours to complete.

I'm looking for ways to make this all go a little faster, and one thing I'm looking at is getting at the data more quickly. Is there a way I can get to the results of my query before it has completed? I've noticed that when I run a "select *" statement against this table in Query Analyzer I start getting results back immediately, but when I use ExecuteReader in .NET the query seems to take around 30 minutes before I can start reading rows.

Sonny Boy
  • What does your actual code look like? It sounds like you're running some heavy WHERE clauses on non-indexed columns. A plain SELECT without filtering, even on a multi-million-row table, should start returning rows quickly, also with ExecuteReader. – Pauli Østerø Jan 13 '11 at 00:30
  • Maybe you can try breaking the query up into smaller batches of rows and running a few threads to process them? – clyc Jan 13 '11 at 01:26
  • @Pauli - The problem isn't the indexes so much as the amount of data. I'm dealing with several BLOB columns. :( – Sonny Boy Jan 13 '11 at 17:51

2 Answers


It may help to use command.ExecuteReader(CommandBehavior.SequentialAccess). However, doing so requires you to read the columns in the order they appear in the select list, and to read large binary values incrementally rather than all at once. I recommend reading the documentation on SequentialAccess so you know what other changes your code might need.
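A rough sketch of what that looks like, adapted to the document-extraction scenario described in the question. The table, column names, and variables like connectionString and outputDir are placeholders, not anything from the asker's schema; it also assumes the usual using directives (System.Data, System.Data.SqlClient, System.IO).

```csharp
// Sketch only – table/column names and connectionString/outputDir are hypothetical.
using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand(
    "SELECT Id, DocumentName, DocumentData FROM dbo.LegacyDocuments", connection))
{
    connection.Open();

    // SequentialAccess streams each row instead of buffering it in full,
    // so the BLOB column is never held in memory all at once.
    using (var reader = command.ExecuteReader(CommandBehavior.SequentialAccess))
    {
        var buffer = new byte[8192];
        while (reader.Read())
        {
            // Under SequentialAccess, columns must be read in select-list order.
            int id = reader.GetInt32(0);
            string name = reader.GetString(1);

            using (var file = File.Create(Path.Combine(outputDir, name)))
            {
                long offset = 0;
                long bytesRead;
                // GetBytes pulls the binary value down in chunks; it returns 0
                // when the value is exhausted.
                while ((bytesRead = reader.GetBytes(2, offset, buffer, 0, buffer.Length)) > 0)
                {
                    file.Write(buffer, 0, (int)bytesRead);
                    offset += bytesRead;
                }
            }
        }
    }
}
```

The key point is that GetBytes with an offset lets you copy each binary value straight to disk in fixed-size chunks instead of materializing a byte array per row.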

Consider that the same query runs instantly in Query Analyzer. That tells me that the query does not inherently take 30 minutes to return the first result, or it would do so in Query Analyzer. I think the handling of large binary values is a likely explanation for this discrepancy. One quick way to test would be to change the query to select everything EXCEPT the binary columns, and see if that changes the time that ExecuteReader takes. If the execution time is the same without binary columns, then it's unlikely that SequentialAccess will help. If it's faster without the binary columns, then SequentialAccess may be worth using.
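The comparison suggested above is easy to time directly. A minimal sketch, assuming an open SqlConnection named connection and the same placeholder table/column names as before; run it once with and once without the binary columns in the select list:

```csharp
// Rough diagnostic – swap the column list and table name for your own.
var stopwatch = System.Diagnostics.Stopwatch.StartNew();
using (var command = new SqlCommand(
    "SELECT Id, DocumentName FROM dbo.LegacyDocuments", connection)) // binary columns omitted
using (var reader = command.ExecuteReader())
{
    reader.Read(); // time-to-first-row is the number that matters here
    Console.WriteLine("First row after {0} ms", stopwatch.ElapsedMilliseconds);
}
```

If the time to first row collapses once the binary columns are gone, that points at BLOB buffering, and SequentialAccess is worth trying.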

Joel Mueller
  • Just to clarify, the full return of the query in QA still takes 30 minutes; it just starts returning part of the data immediately. I've managed to replicate that effect in my code by using SqlCommand.BeginExecuteReader, and I've significantly improved performance. I'll try this suggestion next and see what I can get the total time down to. – Sonny Boy Jan 13 '11 at 17:53
  • Most of my total performance gain came from using SqlCommand.BeginExecuteReader, but this made a good impact as well. Total time is down from 12 hours to 1.5. – Sonny Boy Jan 25 '11 at 23:00

One bit of advice: don't use select * if you don't need all the columns in the query; list only the columns you actually use. Perhaps this question will help too: Performance optimization strategies of last resort

Tim Carter