
I have a table that has 200K+ rows that I need to update on a daily basis. My console app generates all this data on a daily run and needs to update the table with the results.

The situation is that I'm operating over a slow network, and on each run there is less than 0.1% of rows to update/insert/delete, so clearly there's room to optimize. The table is simple - a key column plus 2 nvarchar columns.

So my question is - what's the best approach in this particular case? I could always plow it over and do a SQLBulkCopy, but would SqlDataAdapter be more efficient?

Thx,

Mike

MikeP
  • You could stage your modifications inside the database using `SQLBulkCopy` and kick off a stored procedure after it finishes uploading. – Dustin Kingen Jun 18 '13 at 19:15
  • Thanks Romoku, this would work, but as granadaCoder mentioned, passing the changes directly, serialized into an XML parameter to the SP would be quicker. – MikeP Jun 19 '13 at 09:25
  • I've never worked with Xml inside Sql Server, so it very well may be. – Dustin Kingen Jun 19 '13 at 11:31

1 Answer


Ship XML with all the changes to a stored procedure.

One trip to the database.

Here is an older example:

http://granadacoder.wordpress.com/2009/01/27/bulk-insert-example-using-an-idatareader-to-strong-dataset-to-sql-server-xml/

Here is a smaller example that shows the basics:

http://www.mindfiresolutions.com/Sending-Multiple-Records-As-XML-To-SQL-Server-Stored-Procedure-1861.php

Send the xml to the stored procedure. Shred the xml into a @table variable or #temp table. Do your UPDATEs / INSERTs (or a MERGE/upsert) using the @table variable or #temp table.
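
A rough sketch of what the shred-plus-MERGE could look like, assuming a target table dbo.MyTable with an int key and two nvarchar columns (all names here are placeholders, not from the original answer). The T-SQL is shown inline in the SqlCommand for brevity; in practice the same batch would be the body of the stored procedure:

using System.Data;
using System.Data.SqlClient;

public static class XmlMergeSketch
{
    // Shreds the @xml parameter into a table variable, then MERGEs it into the target table.
    private const string MergeSql = @"
        DECLARE @rows TABLE (TheKey int PRIMARY KEY, Col1 nvarchar(200), Col2 nvarchar(200));

        INSERT INTO @rows (TheKey, Col1, Col2)
        SELECT  x.r.value('@TheKey', 'int'),
                x.r.value('@Col1',   'nvarchar(200)'),
                x.r.value('@Col2',   'nvarchar(200)')
        FROM    @xml.nodes('/Rows/Row') AS x(r);

        MERGE dbo.MyTable AS target
        USING @rows AS src ON target.TheKey = src.TheKey
        WHEN MATCHED AND (target.Col1 <> src.Col1 OR target.Col2 <> src.Col2)
            THEN UPDATE SET Col1 = src.Col1, Col2 = src.Col2
        WHEN NOT MATCHED BY TARGET
            THEN INSERT (TheKey, Col1, Col2) VALUES (src.TheKey, src.Col1, src.Col2);";

    public static void PushChanges(string connectionString, string changesXml)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(MergeSql, conn))
        {
            cmd.Parameters.Add("@xml", SqlDbType.Xml).Value = changesXml;
            conn.Open();
            cmd.ExecuteNonQuery(); // one round trip for the whole batch of changes
        }
    }
}

Here changesXml would carry attribute-based rows such as <Rows><Row TheKey="123" Col1="..." Col2="..." />. Note this MERGE does not cover deletes; you would either send the full key set and add WHEN NOT MATCHED BY SOURCE THEN DELETE, or flag each row with an action attribute.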

EDIT:

http://weblogs.asp.net/dwahlin/archive/2009/09/30/passing-multiple-records-to-a-stored-procedure-in-sql-server.aspx

Another example.

What I like to do is create a strong dataset. Put your data into the strong dataset. Then send the ds.GetXml() to the stored procedure.

That way, you get strong typing (from the strong dataset), and you don't have to write your own xml-maker; you piggyback off of .GetXml(). Hint: after creating the strong dataset, remove the namespace (tempuri or something like that).
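
For illustration only, here is a plain DataSet standing in for a designer-generated strong dataset (the dataset, table, and column names are just placeholders); the point is that .GetXml() hands you the element-based XML without any hand-rolled xml-maker:

using System.Data;

public static class DataSetXmlSketch
{
    public static string BuildChangesXml()
    {
        // A plain DataSet standing in for a designer-generated strong (typed) dataset.
        var ds = new DataSet("Employees");
        var table = ds.Tables.Add("Employee");
        table.Columns.Add("EmployeeKey", typeof(int));
        table.Columns.Add("LastName", typeof(string));
        table.Columns.Add("FirstName", typeof(string));

        table.Rows.Add(123, "Smith", "John");
        table.Rows.Add(234, "Jones", "Mary");

        // Element-based output: <Employees><Employee><EmployeeKey>123</EmployeeKey>...</Employee>...</Employees>
        return ds.GetXml();
    }
}

That element-based shape is exactly the bloat the 2019 append below trades away for attributes.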

APPEND ( Spring, 2019 )

I no longer put items in a strong dataset (or any dataset) because of xml "bloat".

I write a custom PocoObject-To-Xml converter (that is xml-attribute based) and pass that down to the stored procedure.

Below is the element-based form...it shows the xml bloat:

<Employees>
  <Employee>
      <EmployeeKey>123</EmployeeKey>
      <LastName>Smith</LastName>
      <FirstName>John</FirstName>
  </Employee>
  <Employee>
      <EmployeeKey>234</EmployeeKey>
      <LastName>Jones</LastName>
      <FirstName>Mary</FirstName>
  </Employee>
</Employees>

vs the attribute-based form (more trimmed down):

<Employees>
  <Employee EmployeeKey="123" LastName="Smith" FirstName="John" />
  <Employee EmployeeKey="234" LastName="Jones" FirstName="Mary" />
</Employees>

And the custom converter code is below:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;

namespace MyNamespace
{
    public class EmployeesToXmlConverter
    {
        public string ConvertToXmlString(ICollection<EmployeePoco> emps)
        {
            StringBuilder sb = new StringBuilder();
            XmlWriter writer = XmlWriter.Create(sb);

            writer.WriteStartElement("root");

            if (null != emps && emps.Any())
            {
                writer.WriteStartElement("Employees");
                foreach (EmployeePoco emp in emps)
                {
                    writer.WriteStartElement("Employee");
                    writer.WriteAttributeString("EmployeeKey", Convert.ToString(emp.EmployeeKey));
                    writer.WriteAttributeString("LastName", emp.LastName);
                    writer.WriteAttributeString("FirstName", emp.FirstName);
                    writer.WriteEndElement(); // close the Employee element
                }

                writer.WriteEndElement(); // close the Employees element

            }

            writer.WriteEndElement(); // close the root element
            writer.Close();
            string returnValue = sb.ToString();
            return returnValue;
        }
    }
}
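
A hypothetical caller for the converter above - the EmployeePoco property names are taken from the converter code, while the stored procedure name, parameter name, and connection string are placeholders:

using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class SaveEmployeesExample
{
    public static void Save(string connectionString, ICollection<EmployeePoco> emps)
    {
        // Build the attribute-based xml and ship it down in a single call.
        string xml = new EmployeesToXmlConverter().ConvertToXmlString(emps);

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("dbo.uspEmployeeUpsert", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.Add("@xml", SqlDbType.Xml).Value = xml;
            conn.Open();
            cmd.ExecuteNonQuery(); // still one trip to the database
        }
    }
}

Inside the stored procedure, the shape to shred would be /root/Employees/Employee, reading the EmployeeKey / LastName / FirstName attributes, since the converter wraps everything in a root element.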
granadaCoder
  • Good suggestion, I think I'm going to go with the XML approach. One additional thing is that when I generate each new batch of data, I cannot immediately determine which rows have changed (compared to the destination table contents) - I need to load the current contents of the destination table and compare, to come up with a list of rows to update, insert OR delete. Now what's the most efficient way to do this? Get a DataReader on the destination table and use it to populate a custom List<> that would serve as a lookup? 200K+ is a bit too much for a List<>. So a DataTable? – MikeP Jun 19 '13 at 09:22
  • Create a basic DTO Object (class with simple properties), Create a DTOBasicObjectCollection : List....and use an IDataReader to populate them. That is your slimmest approach. – granadaCoder Jun 19 '13 at 13:06
  • Check my answer here for some IDataReader to DTO....basic setup. http://stackoverflow.com/questions/13650443/why-is-datatable-faster-than-datareader/15908284#15908284 – granadaCoder Jun 19 '13 at 13:21
  • If the xml is what you are going to go with, you should mark this question as "answered", so it doesn't keep showing up with the unanswered filter applied. – granadaCoder Jun 19 '13 at 18:07
  • Implemented, works well. Filling the List<> from the IDataReader takes 12 seconds. Then I insert the generated data into a second list and get the changed items using the .Except() method - the result is almost immediate (a sketch of that comparison follows these comments). One more question though - what is the benefit of using IDataReader explicitly? SqlDataReader already implements this interface, so what is the difference? – MikeP Jun 20 '13 at 15:51
  • http://stackoverflow.com/questions/383947/what-does-it-mean-to-program-to-an-interface – granadaCoder Jun 21 '13 at 00:30
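
A rough sketch of the comparison described in the comments above, using a hypothetical RowDto for the question's table (int key plus two nvarchar columns) and the placeholder dbo.MyTable; Except() picks up new and changed rows because the DTO defines value-based equality, and deletes fall out of a key comparison:

using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;

// Hypothetical DTO for the question's table. Value-based equality is what lets
// Enumerable.Except() treat "same key, different values" as a changed row.
public sealed class RowDto : IEquatable<RowDto>
{
    public int TheKey { get; set; }
    public string Col1 { get; set; }
    public string Col2 { get; set; }

    public bool Equals(RowDto other) =>
        other != null && TheKey == other.TheKey && Col1 == other.Col1 && Col2 == other.Col2;

    public override bool Equals(object obj) => Equals(obj as RowDto);
    public override int GetHashCode() => TheKey; // key is unique, good enough for hashing
}

public static class DiffSketch
{
    // Populate the "current" list straight from a data reader (no DataTable overhead).
    public static List<RowDto> LoadCurrent(string connectionString)
    {
        var rows = new List<RowDto>();
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("SELECT TheKey, Col1, Col2 FROM dbo.MyTable", conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    rows.Add(new RowDto
                    {
                        TheKey = reader.GetInt32(0),
                        Col1 = reader.IsDBNull(1) ? null : reader.GetString(1),
                        Col2 = reader.IsDBNull(2) ? null : reader.GetString(2)
                    });
                }
            }
        }
        return rows;
    }

    public static void Diff(List<RowDto> current, List<RowDto> generated,
                            out List<RowDto> upserts, out List<int> deletes)
    {
        // Rows in the new batch that are missing from, or differ from, the table.
        upserts = generated.Except(current).ToList();

        // Keys that exist in the table but not in the new batch.
        var newKeys = new HashSet<int>(generated.Select(r => r.TheKey));
        deletes = current.Where(r => !newKeys.Contains(r.TheKey)).Select(r => r.TheKey).ToList();
    }
}

The upserts and deletes lists would then be serialized to xml and handed to the stored procedure as discussed in the answer.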