
I have a class that stores update statements. I can't execute these statements within my parallel for loop, as this causes deadlocks.

The statements are executed once the loop has completed. For a scenario of around 100 rows this works fine, but certain scenarios generate over 100,000 statements, and executing those in a sequential loop takes too long.

What I wish to achieve is this: once 100 statements have been added to the class, execute them on a separate thread and clear the statements variable, so that the next 100 can be added and executed.

I am new to multithreading. Is this achievable? If not, what other option can I use to reduce the execution time? Note that I can't change the statement logic, as there are many other factors attached to it.
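Roughly, this is what I have in mind (just a sketch with placeholder names, not my actual code):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

class StatementBatcher
{
    private readonly BlockingCollection<string> _pending = new BlockingCollection<string>();

    // Called from inside the parallel loop; only queues the statement,
    // it never touches the database itself.
    public void Add(string statement)
    {
        _pending.Add(statement);
    }

    // Called once the parallel loop has finished producing statements.
    public void CompleteAdding()
    {
        _pending.CompleteAdding();
    }

    // A single background consumer drains the queue in batches of 100
    // and hands each batch to a caller-supplied executor, so only one
    // thread ever writes to the database.
    public Task StartConsumer(Action<List<string>> executeBatch)
    {
        return Task.Factory.StartNew(() =>
        {
            var batch = new List<string>(100);
            foreach (var statement in _pending.GetConsumingEnumerable())
            {
                batch.Add(statement);
                if (batch.Count == 100)
                {
                    executeBatch(batch);
                    batch = new List<string>(100);
                }
            }
            if (batch.Count > 0)       // flush the last partial batch
                executeBatch(batch);
        }, TaskCreationOptions.LongRunning);
    }
}
```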

Let me clarify my scenario further. The table I have is more of a log table that keeps track of SQL rows transferred to clients, so that I do not resend the same object twice. This is to reduce bandwidth usage, as the objects are transferred over an internet link.

Initially I was executing each statement as soon as I got a reply from the client. This worked great when the loops were sequential, but proved to be too slow, so I opted to use parallel for loops. This is where all the issues came up, as the same table was being selected from, inserted into, and updated at virtually the same time, causing deadlocks.

So I decided to keep the statements in a list of strings to execute later.

I tried converting all the strings into a single string using string.Join, but this gave me a System.OutOfMemoryException, so I am executing them sequentially, one by one. Now the transfer takes 5 minutes and the execution 30 minutes, so I am looking for a solution to this.

Kush
  • Can you show your statements? Probably your SQL code needs to be written with threading in mind. – Mr. Mr. Jun 07 '13 at 14:18
  • Perhaps you should look at SQLBulkCopy http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx – Chris Moutray Jun 07 '13 at 14:20
  • Why does it cause deadlock? – Serge Jun 07 '13 at 14:21
  • You could use a pool of connections in combination with a parallel for loop to solve this. – Alxandr Jun 07 '13 at 14:21
  • @Corak that doesn't execute them in parallel threads. – Geeky Guy Jun 07 '13 at 14:21
  • Rather than calling statements one by one from your application, why not have your application build a SQL statement of ALL the updates and let SQL sort out the mess? – valverij Jun 07 '13 at 14:21
  • @Renan - Yes, I assumed each statement itself was small and the main problem was opening the connection each time. That's not necessarily the case. – Corak Jun 07 '13 at 14:24
  • You should build up a temp table, then make SET ("batch") based updates. – Serge Jun 07 '13 at 14:27
  • Using a single thread that actually performs the work is the right idea. It can monitor a queue of pending updates and periodically perform an update, e.g. when the count of pending updates reaches 100 or 5 seconds after the last update, whichever comes first. As valverij suggests, grouping them into one update will improve performance. – HABO Jun 07 '13 at 15:21
  • Do you mean I execute the 100,000 statements as a single string? – Kush Jun 07 '13 at 15:27
  • Passing 100,000 statements in a single string is probably not the best idea. If you could give us a little more information on the actual SQL work being done we could be more helpful. Are you building completely unrelated statements or are they all `UPDATE` statements against a single table? Do they update same fields? Do they update the data they just updated, i.e. you could eliminate some without causing problems? How critical is the timing? How critical is the data? ... – HABO Jun 07 '13 at 15:32
  • The statements basically consist of "if not exists, insert, else update". The column being updated is a single column; the table stores each client ID and whether the object has been transferred. The timing is crucial since the data needs to move every 20 minutes, and the data is required for the day-to-day operations of the client. The statement affects only a single table, the most crucial table for the entire DTS. – Kush Jun 07 '13 at 15:38
  • It sounds like you could pass a table-valued parameter to a stored procedure that would do all the work in a single `MERGE` statement. – HABO Jun 07 '13 at 15:43
  • You mean I create a single stored procedure and pass the values one by one, executing each time, or pass an entire value array at the end, loop through the values and execute them in the SP? – Kush Jun 07 '13 at 15:48
  • Have a look [here](http://stackoverflow.com/questions/5595353/how-to-pass-table-value-parameters-to-stored-procedure-from-net-code) for an example of passing a table of values to a stored procedure. SQL Server is designed to perform set-based operations, not RBAR (row by agonizing row). Pass the table of new client IDs in and let a single `MERGE` statement "upsert" all of them. You'll be able to detect the change in performance. I wouldn't wait until I had 100,000 values to process, but experiment and try a few different sizes, e.g. 100, 200, 500, 1000. (A sketch of this approach follows these comments.) – HABO Jun 07 '13 at 15:56
  • Okay, thanks. Let me try that option and see. Will revert with my findings. Hope the same works for SQL 2000. – Kush Jun 07 '13 at 16:04
  • And that's why there are tags for various versions of SQL Server. You'll find that SS2K doesn't support either `MERGE` or table-valued parameters. Since you didn't tag your question with a database engine, I felt free to assume you have the latest SQL Server bits. Sorry. – HABO Jun 07 '13 at 16:23
  • That's okay. I need this only for the server, which is SQL 2012. I was just inquiring in case I ever needed to use it in SQL 2000. – Kush Jun 07 '13 at 16:41
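To illustrate HABO's table-valued parameter suggestion, here is a minimal sketch. The table type, stored procedure, and column names are assumptions, not the asker's actual schema:

```csharp
using System.Data;
using System.Data.SqlClient;

static class TransferLogWriter
{
    // Assumed SQL Server objects (names are hypothetical):
    //
    //   CREATE TYPE dbo.TransferredObjectList AS TABLE (ClientId INT, ObjectId INT);
    //
    //   CREATE PROCEDURE dbo.UpsertTransferredObjects
    //       @Rows dbo.TransferredObjectList READONLY
    //   AS
    //   MERGE dbo.TransferLog AS target
    //   USING @Rows AS source
    //      ON target.ClientId = source.ClientId AND target.ObjectId = source.ObjectId
    //   WHEN MATCHED THEN
    //       UPDATE SET target.Transferred = 1
    //   WHEN NOT MATCHED THEN
    //       INSERT (ClientId, ObjectId, Transferred)
    //       VALUES (source.ClientId, source.ObjectId, 1);

    public static void UpsertBatch(string connectionString, DataTable rows)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand("dbo.UpsertTransferredObjects", connection))
        {
            command.CommandType = CommandType.StoredProcedure;

            // The whole batch travels as one table-valued parameter,
            // so the server performs a single set-based upsert.
            var parameter = command.Parameters.AddWithValue("@Rows", rows);
            parameter.SqlDbType = SqlDbType.Structured;
            parameter.TypeName = "dbo.TransferredObjectList";

            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}
```

The DataTable only needs ClientId and ObjectId columns matching the table type; as HABO notes, batches of a few hundred to a few thousand rows per call are reasonable starting points to experiment with.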

2 Answers


Since it sounds like you continuously keep getting new stuff to insert into your database, you can use a SqlTransaction on which you execute your statements whenever they become available. Then, every once in a while, you commit the transaction.

Check MSDN for an example of how to use it.

EDIT: If, on the other hand, you get a lot of statements to execute in one heap, use SqlBulkCopy if possible, as LoztInSpace says.
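A minimal sketch of what that could look like, assuming the statements arrive as plain SQL strings (the class name and the batch size of 100 are arbitrary):

```csharp
using System.Collections.Generic;
using System.Data.SqlClient;

static class BatchedExecutor
{
    // Executes statements on one connection and commits every 100 statements,
    // so no single transaction grows too large while the work is still batched.
    public static void Execute(string connectionString, IEnumerable<string> statements)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            var transaction = connection.BeginTransaction();
            var count = 0;

            foreach (var sql in statements)
            {
                using (var command = new SqlCommand(sql, connection, transaction))
                {
                    command.ExecuteNonQuery();
                }

                if (++count % 100 == 0)
                {
                    transaction.Commit();                        // flush this batch
                    transaction = connection.BeginTransaction(); // start the next one
                }
            }

            transaction.Commit(); // commit whatever is left
        }
    }
}
```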

SamiHuutoniemi
  • I don't understand how this helps. You should always do your database work in a transaction. Can you elaborate on your suggestion? – LoztInSpace Jun 07 '13 at 14:29
  • Far from every SQL operation is part of a transaction. I was just suggesting that you can commit every now and then (with your background thread), instead of waiting for too big a number of statements to build up. – SamiHuutoniemi Jun 07 '13 at 14:33
  • All the statements are in a single transaction. It's just that there are too many of them. – Kush Jun 07 '13 at 15:29
  • Do these statements "appear" at the same time, or over time? – SamiHuutoniemi Jun 07 '13 at 18:02

I've had great success with using SQLBulkCopy into a temp table then executing an UPDATE against that (actually a MERGE, but same principle).

We got a 5 minute batch down to a few seconds.
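For reference, a rough sketch of that pattern, assuming a simple client/object log table (the staging table, target table, and column names below are made up, not the actual schema):

```csharp
using System.Data;
using System.Data.SqlClient;

static class BulkUpserter
{
    public static void BulkUpsert(string connectionString, DataTable rows)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            // 1. Stage the batch in a session-local temp table.
            using (var create = new SqlCommand(
                "CREATE TABLE #Staging (ClientId INT, ObjectId INT);", connection))
            {
                create.ExecuteNonQuery();
            }

            using (var bulk = new SqlBulkCopy(connection))
            {
                bulk.DestinationTableName = "#Staging";
                bulk.WriteToServer(rows);
            }

            // 2. One set-based MERGE from the temp table into the real table.
            const string merge = @"
                MERGE dbo.TransferLog AS target
                USING #Staging AS source
                   ON target.ClientId = source.ClientId AND target.ObjectId = source.ObjectId
                WHEN MATCHED THEN
                    UPDATE SET target.Transferred = 1
                WHEN NOT MATCHED THEN
                    INSERT (ClientId, ObjectId, Transferred)
                    VALUES (source.ClientId, source.ObjectId, 1);";

            using (var command = new SqlCommand(merge, connection))
            {
                command.ExecuteNonQuery();
            }
        }
    }
}
```

The temp table lives only for the duration of the connection, and the single MERGE replaces tens of thousands of individual statements with one set-based operation.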

LoztInSpace