0

Usually I only run this loop with fewer than 50 IDs and it runs fine. Now I am trying to scale up, and with 5k+ it takes several minutes.

Public Shared Sub Add(type As BufferType, objIDs As List(Of Integer), userToken As String)
    Dim timeInUTCSeconds As Integer = Misc.UTCDateToSeconds(Date.Now)
    For Each objID As Integer In objIDs
        Dim insertStmt As String = "IF NOT EXISTS (SELECT ObjectID From " & TableName(type) & " where ObjectID = " & objID & " and UserToken = '" & userToken.ToString & "')" & _
        " BEGIN INSERT INTO " & TableName(type) & "(ObjectID,UserToken,time) values(" & objID & ", '" & userToken.ToString & "', " & timeInUTCSeconds & ") END" & _
        " ELSE BEGIN UPDATE " & TableName(type) & " set Time = " & timeInUTCSeconds & " where ObjectID = " & objID & " and UserToken = '" & userToken.ToString & "' END"
        DAL.SQL.Insert(insertStmt)
    Next
End Sub
marc_s
radbyx
  • You should really start to use SQL parameters instead of concatenating strings, even if your input is under your control. That might change in the future, and you have a SQL injection vulnerability. Using parameterized queries can also improve performance. Your whole `SQL` class is the problem. – Tim Schmelter Jan 05 '16 at 10:00
  • Similar questions are asked all the time http://stackoverflow.com/questions/334919/mass-insert-into-sql-server – Alex Jan 05 '16 at 10:01
  • TableName(type) doesn't change inside the loop, so define it once outside the loop; userToken is already a string, so what is the purpose of ToString()? Now the big changes are: 1) use parameters, 2) use a stored procedure taking a table-valued parameter, 3) ditch the SQL class. – Steve Jan 05 '16 at 10:01
  • [BulkInsert](https://msdn.microsoft.com/de-de/library/ms188365%28v=sql.120%29.aspx) and [OpenRowSet](https://msdn.microsoft.com/de-de/library/ms175915%28v=sql.120%29.aspx) – Alex B. Jan 05 '16 at 10:03
  • Great candidate for `MERGE` and a `Table Valued Parameter` (if SQL Server 2012+). Now you are calling the DB multiple times for each row: checking for existence, then INSERT/UPDATE. Using MERGE and a TVP there will be just one call. – Lukasz Szozda Jan 05 '16 at 10:06
  • Which DBMS are you using? –  Jan 05 '16 at 10:08
  • @TimSchmelter Thanks for your answer. We are already aware of the vulnerability and are working to fix it. My focus is on the scaling at the moment. – radbyx Jan 05 '16 at 10:27
  • I am using 2012, but we have to be compatible with 2008 as well for now. It would be nice if we could leave 2008 behind and take full advantage of 2012. – radbyx Jan 05 '16 at 10:29
  • @Steve Good points about the two refactoring issues; you are right, and I will clean it up. 1) This is not my concern for now, but I am aware of it indeed. 2) I will start implementing this now. Making an SP that takes, say, 1k IDs would reduce the number of connections to the DB by a factor of 1k. That should help a lot, because most of the time is spent opening and closing that DB connection, as I understand it. 3) For this function, yes. – radbyx Jan 05 '16 at 10:43
  • Reconsider your decision about parameters. Query plan pollution has an impact on performance. This is a bit old, but still very useful to read today: [How Data Access Code Affects Database Performance](https://msdn.microsoft.com/en-us/magazine/ee236412.aspx). However, if you go for the stored procedure and table-valued parameter then.... – Steve Jan 05 '16 at 10:50
  • I'm not saying I take it lightly or that it won't be fixed. I am saying it's not my focus for now. – radbyx Jan 05 '16 at 10:58
  • @coder You are right. SqlBulkCopy looks like it would do the job, I'll give it try now. – radbyx Jan 06 '16 at 08:59
  • @radbyx: yes, give it a try at least :) – Nad Jan 06 '16 at 09:34
  • @coder That class `SqlBulkCopy` was awesome. I went from minutes to a few milliseconds with 10k+. You should make an answer so I can accept it, or else I will; I would prefer if you did. Btw, I need the same for mass deleting, but from what I could google there is no such thing as a SqlBulkDelete; instead one should do a DELETE with a good WHERE clause. Any thoughts? – radbyx Jan 07 '16 at 06:55
  • @radbyx: Added an answer. It may not directly help you, but it is the exact documentation and a bit of code which makes clear how to deal with your requirement. – Nad Jan 07 '16 at 10:55

3 Answers

1

The best option is to move the whole workload to the SQL side: pass your objIDs list as a table-valued parameter to a stored procedure and use a single MERGE statement that updates matched rows and inserts the rest.

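As a rough illustration (not code from this answer), here is a minimal sketch of the calling side, assuming SQL Server 2008+ and a user-defined table type. `dbo.IntList`, `dbo.Buffer`, `connectionString`, and the column names are placeholder assumptions, not taken from the question:

Imports System.Collections.Generic
Imports System.Data
Imports System.Data.SqlClient

' Prerequisite on the SQL side (placeholder name):
'   CREATE TYPE dbo.IntList AS TABLE (ObjectID INT NOT NULL)
Public Shared Sub AddBatch(objIDs As List(Of Integer), userToken As String, timeInUTCSeconds As Integer)
    ' One MERGE covers both branches (update existing rows, insert the rest)
    ' in a single round trip to the server.
    Dim mergeStmt As String = "MERGE dbo.Buffer AS target " & _
        "USING @ids AS source ON target.ObjectID = source.ObjectID AND target.UserToken = @token " & _
        "WHEN MATCHED THEN UPDATE SET Time = @time " & _
        "WHEN NOT MATCHED THEN INSERT (ObjectID, UserToken, Time) VALUES (source.ObjectID, @token, @time);"

    ' The TVP travels as a DataTable whose columns match the table type.
    Dim ids As New DataTable()
    ids.Columns.Add("ObjectID", GetType(Integer))
    For Each objID As Integer In objIDs
        ids.Rows.Add(objID)
    Next

    Using conn As New SqlConnection(connectionString) ' connectionString assumed available
        Using cmd As New SqlCommand(mergeStmt, conn)
            Dim tvp As SqlParameter = cmd.Parameters.AddWithValue("@ids", ids)
            tvp.SqlDbType = SqlDbType.Structured
            tvp.TypeName = "dbo.IntList"
            cmd.Parameters.AddWithValue("@token", userToken)
            cmd.Parameters.AddWithValue("@time", timeInUTCSeconds)
            conn.Open()
            cmd.ExecuteNonQuery() ' one call for the whole list
        End Using
    End Using
End Sub
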
If you need to keep the work on the VB side, there are a number of optimizations you can make:

  • SQL parameters. A must.
  • Check whether the DAL.SQL class handles connections properly (ideally reusing one open connection rather than opening a new one per statement).
  • Construct a single batch from the commands and run it as one whole batch.
1

(I'm assuming a reasonably recent version of SQL Server, on the balance of probability, given it's a .Net question.)

  1. If at all possible, move this into a stored procedure that takes either a table-valued parameter or a delimited string of IDs.
  2. Parameterise your queries. When the DBMS is executing a string of code with no parameters, it can't work out which bits are constant and which are parameters, so it works out a new execution plan each time. With parameters for the data, it would reuse execution plans and run faster.
  3. With really big repetitive jobs, connection latency can become a surprisingly big issue. If you really have to execute something like this as ad-hoc SQL from the calling app (which I wouldn't for this, but it's your project), try batching the statements and executing them en masse rather than issuing 5,000 individual connections (see the sketch after this list). Should be significantly faster.
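
Purely as an illustration of points 2 and 3 (not code from this answer), a sketch that keeps one connection open and reuses a single parameterized command, folding the existence check into an UPDATE-then-INSERT upsert. `connectionString` and the method name `AddBatched` are placeholder assumptions; full batching would additionally concatenate several statements per ExecuteNonQuery call:

Imports System.Collections.Generic
Imports System.Data
Imports System.Data.SqlClient

Public Shared Sub AddBatched(tableName As String, objIDs As List(Of Integer), userToken As String, timeInUTCSeconds As Integer)
    ' Table names cannot be parameterized, so tableName must still be
    ' validated against a whitelist before being concatenated in.
    Dim upsertStmt As String = "UPDATE " & tableName & " SET Time = @time WHERE ObjectID = @id AND UserToken = @token; " & _
        "IF @@ROWCOUNT = 0 INSERT INTO " & tableName & " (ObjectID, UserToken, Time) VALUES (@id, @token, @time);"

    Using conn As New SqlConnection(connectionString) ' one connection for the whole loop
        conn.Open()
        Using cmd As New SqlCommand(upsertStmt, conn)
            Dim idParam As SqlParameter = cmd.Parameters.Add("@id", SqlDbType.Int)
            cmd.Parameters.AddWithValue("@token", userToken)
            cmd.Parameters.AddWithValue("@time", timeInUTCSeconds)
            For Each objID As Integer In objIDs
                idParam.Value = objID
                cmd.ExecuteNonQuery() ' plan compiled once, reused for every ID
            Next
        End Using
    End Using
End Sub
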
eftpotrm
  • I'm starting out trying 1. For 2 and 3, did you mean those as alternatives, or are they combined with 1? – radbyx Jan 05 '16 at 10:45
  • 3's if you really have to execute lots of individual commands from your app. I'm fairly sure in this case you should be able to write a sproc that'll do this all in one call (suggestion 1), but I'm noting point 3 just in case! – eftpotrm Jan 05 '16 at 10:52
1

You should always use `SqlBulkCopy` for inserting a large amount of data. Have a look here; it has always been considered the best practice for inserting large amounts of data into a table.

The demo code below, taken from here, will make this clear:

private static void PerformBulkCopy()
{
    string connectionString =
            @"Server=localhost;Database=Northwind;Trusted_Connection=true";
    // get the source data
    using (SqlConnection sourceConnection =
            new SqlConnection(connectionString))
    {
        SqlCommand myCommand =
            new SqlCommand("SELECT * FROM tablename", sourceConnection);
        sourceConnection.Open();

        using (SqlDataReader reader = myCommand.ExecuteReader())
        {
            // open the destination connection
            using (SqlConnection destinationConnection =
                        new SqlConnection(connectionString))
            {
                destinationConnection.Open();

                using (SqlBulkCopy bulkCopy =
                        new SqlBulkCopy(destinationConnection.ConnectionString))
                {
                    bulkCopy.BatchSize = 500;     // rows per batch sent to the server
                    bulkCopy.NotifyAfter = 1000;  // raise SqlRowsCopied every 1000 rows
                    bulkCopy.SqlRowsCopied +=
                        new SqlRowsCopiedEventHandler(bulkCopy_SqlRowsCopied);
                    bulkCopy.DestinationTableName = "Tablename";
                    bulkCopy.WriteToServer(reader);
                }
            }
        }
    }
}

// progress handler referenced above
private static void bulkCopy_SqlRowsCopied(object sender, SqlRowsCopiedEventArgs e)
{
    Console.WriteLine("Copied {0} rows so far...", e.RowsCopied);
}

Also, do not forget to read the documentation for the `BatchSize` property.
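
For the scenario in the question (a `List(Of Integer)` plus a token, rather than a table-to-table copy), here is a rough VB sketch of the same idea: stage the rows in a DataTable and hand them to SqlBulkCopy. The table name `dbo.Buffer`, the column names, and `connectionString` are placeholder assumptions, not taken from the question:

Imports System.Collections.Generic
Imports System.Data
Imports System.Data.SqlClient

Public Shared Sub BulkAdd(objIDs As List(Of Integer), userToken As String, timeInUTCSeconds As Integer)
    ' Stage the rows in memory with the same shape as the destination table.
    Dim table As New DataTable()
    table.Columns.Add("ObjectID", GetType(Integer))
    table.Columns.Add("UserToken", GetType(String))
    table.Columns.Add("Time", GetType(Integer))
    For Each objID As Integer In objIDs
        table.Rows.Add(objID, userToken, timeInUTCSeconds)
    Next

    Using bulkCopy As New SqlBulkCopy(connectionString) ' connectionString assumed available
        bulkCopy.DestinationTableName = "dbo.Buffer" ' placeholder table name
        ' Map by name so the destination's column order doesn't matter.
        bulkCopy.ColumnMappings.Add("ObjectID", "ObjectID")
        bulkCopy.ColumnMappings.Add("UserToken", "UserToken")
        bulkCopy.ColumnMappings.Add("Time", "Time")
        bulkCopy.BatchSize = 500
        bulkCopy.WriteToServer(table) ' one bulk operation instead of thousands of INSERTs
    End Using
End Sub

One caveat: SqlBulkCopy only inserts rows; it will not update the Time of rows that already exist. If the question's insert-or-update behaviour is still needed, a common pattern is to bulk-copy into an empty staging table and then run a single MERGE from staging into the real table.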

Hope that helps.

Nad