
I have a small console application written in C# on .NET 2.0. The application queries a database for a set of ID numbers, calls a web service with those IDs to get a "status" for each of them, and then writes the statuses back to the database. Throughout the whole process I keep one open connection to the database, which I'm not sure is the best approach for batch processing, since the application blocks the database engine and other clients are unable to work properly. Opening and closing a connection for each operation does not seem wise either. The loop is roughly the pattern sketched below.
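A sketch of the current workflow, with placeholder names standing in for the real data access and web service proxy code:

using System.Collections.Generic;
using System.Data.SqlClient;

class BatchRunner
{
    public static void Run(string connectionString)
    {
        using (SqlConnection conn = new SqlConnection(connectionString))
        {
            conn.Open();                            // held open for the whole run

            List<int> ids = LoadIds(conn);          // 1. query the ids
            foreach (int id in ids)
            {
                int status = GetStatus(id);         // 2. one web service call per id
                WriteStatus(conn, id, status);      // 3. one UPDATE per id
            }
        }
    }

    // Stubs standing in for the real queries and the service proxy.
    static List<int> LoadIds(SqlConnection conn) { return new List<int>(); }
    static int GetStatus(int id) { return 0; }
    static void WriteStatus(SqlConnection conn, int id, int status) { }
}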

What would be considered the best way to handle such batch processes?

Alexidze
  • Why is opening and closing a connection not wise? And why do you need to keep your connection open? Can't you have a stored procedure handle all the ID updates for you? I know that SQL Server caches such queries, so you don't need to worry about performance there. – manman Nov 26 '12 at 10:17
  • Because I've tried opening and closing the connection for each update, and there are about 20,000 of them each time. From the second part of your comment I gather that the best way would be to buffer all the updates and then send them together to an SP, rather than updating values one by one? – Alexidze Nov 26 '12 at 10:30
  • 1
    Yes, you can buffer them and then have an SP to handle the updates for you. – manman Nov 26 '12 at 10:35

2 Answers


If you have loads of IDs and want to update them all in one go (a batch), you can form two CSV strings and pass them as parameters to a stored procedure (SP).

You also need a function to split those CSVs inside the SP (there are a few around) and then update your table; a minimal split function is sketched below in case you don't have one.
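This sketch assumes a single-character delimiter and returns one row per item in a column named Items, which is what the procedure below expects:

CREATE FUNCTION [dbo].[Split] (@List VARCHAR(MAX), @Delimiter CHAR(1))
RETURNS @Result TABLE (Items VARCHAR(100))
AS
BEGIN
     DECLARE @Pos INT
     SET @List = @List + @Delimiter              -- terminate the list
     SET @Pos = CHARINDEX(@Delimiter, @List)
     WHILE @Pos > 0
     BEGIN
          INSERT INTO @Result (Items) VALUES (LEFT(@List, @Pos - 1))
          SET @List = SUBSTRING(@List, @Pos + 1, LEN(@List))
          SET @Pos = CHARINDEX(@Delimiter, @List)
     END
     RETURN
END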

Example:

//Make sure the ids and statusIds correlate, i.e. the same index
//in both CSVs refers to the same row.
string csvIds = "101,2234,349,...";   //CSV for ids
string csvStatusIds = "1,2,3,...";    //CSV for statusIds

-- Pass the above CSV lists, as parameters, to the following stored procedure:
CREATE PROCEDURE [dbo].[updateStatuses] 

    @Ids VARCHAR(Max),
    @StatusIds VARCHAR(Max)

AS
BEGIN
     DECLARE @tblIds TABLE (key1 INT IDENTITY(1,1), myId INT)
     DECLARE @tblStatusIds TABLE (key2 INT IDENTITY(1,1), myStatusId INT)

     INSERT INTO @tblIds (myId)
        SELECT Items FROM dbo.Split(@Ids, ',')
     INSERT INTO @tblStatusIds (myStatusId)
        SELECT Items FROM dbo.Split(@StatusIds, ',')

     -- Do your status update here
     UPDATE yt
     SET yt.statusId = s.myStatusId
     FROM [YourTable] yt
          JOIN @tblIds i ON yt.id = i.myId
          JOIN @tblStatusIds s ON i.key1 = s.key2

END
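On the C# side, a sketch of how the procedure could be called (the connection string is an assumption; string.Join shows one way to build the CSVs, and a parameter size of -1 maps to VARCHAR(MAX)):

using System.Data;
using System.Data.SqlClient;

class StatusWriter
{
    public static void SaveStatuses(string connectionString, string[] ids, string[] statusIds)
    {
        // Build the two CSV strings; indexes must line up between the arrays.
        string csvIds = string.Join(",", ids);
        string csvStatusIds = string.Join(",", statusIds);

        using (SqlConnection conn = new SqlConnection(connectionString))
        using (SqlCommand cmd = new SqlCommand("updateStatuses", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.Add("@Ids", SqlDbType.VarChar, -1).Value = csvIds;
            cmd.Parameters.Add("@StatusIds", SqlDbType.VarChar, -1).Value = csvStatusIds;

            conn.Open();              // one open/execute/close for the whole batch
            cmd.ExecuteNonQuery();
        }
    }
}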
Kaf
  • Thank you for such a detailed response. There is a bit of extra data beyond just the IDs, like comment strings etc. Do you see this solution as optimal in that case as well? – Alexidze Nov 26 '12 at 10:49
  • 1
    Yes, **You need to be careful when passing comments as CSV** as there can be `,` (commas) which will break the split function. In that case you can use a different delimiter (`such as | or ~`). – Kaf Nov 26 '12 at 11:06

You can try to implement batch inserts/reads yourself, but, surprisingly, this is something NHibernate is good at. I would suggest setting it up and trying it, because it's a relatively small amount of work and the gain can be considerable. A rough sketch of that route follows.
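For illustration only, assuming a hypothetical mapped entity called Record (its .hbm.xml mapping not shown) and a hibernate.cfg.xml already in place; adonet.batch_size is the NHibernate setting that groups statements into batched round trips:

using System.Collections.Generic;
using NHibernate;
using NHibernate.Cfg;

// Hypothetical mapped entity; NHibernate needs virtual members for proxying.
public class Record
{
    private int id;
    private int statusId;
    public virtual int Id { get { return id; } set { id = value; } }
    public virtual int StatusId { get { return statusId; } set { statusId = value; } }
}

class NhStatusUpdater
{
    public static void UpdateStatuses(IDictionary<int, int> statusesById)
    {
        Configuration cfg = new Configuration().Configure(); // reads hibernate.cfg.xml
        cfg.SetProperty("adonet.batch_size", "50");          // group statements into batches of 50

        using (ISessionFactory factory = cfg.BuildSessionFactory())
        using (ISession session = factory.OpenSession())
        using (ITransaction tx = session.BeginTransaction())
        {
            foreach (KeyValuePair<int, int> pair in statusesById)
            {
                Record record = session.Get<Record>(pair.Key);
                record.StatusId = pair.Value;
            }
            tx.Commit(); // flushes the pending updates in batches
        }
    }
}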

kubal5003