
I'm struggling with a SQL Server database in which I keep product references and add prices to them from several suppliers.

I read the reference and the price from an .xlsx file and store the prices in the SQL Server database. I have to check whether the reference already exists: if it does, I update the row; if not, I insert a new one.

One reference takes about 100-120 ms to be inserted/updated in the database; most of that time, roughly 70 ms, is consumed by ExecuteReader (I also tried ExecuteNonQuery).

My question: is there a way to make this faster? With 1,000,000 references it would take about 33 hours to update...

Here is the code:

string ref;
string price;

cs.Open();

for (int i = 1; i < lastRow; i++)
{
    ref = (xlWorksheet.Cells[i, 1] as Excel.Range).Value2.ToString();
    price = (xlWorksheet.Cells[i, 2] as Excel.Range).Value2.ToString();
    price = price.Replace(',', '.');

    command = new SqlCommand("UPDATE ORG SET "+supplierName+" = '" + price + "' WHERE REFERENCJE = '" + ref + "'"
            + " IF @@ROWCOUNT = 0"
            + " INSERT INTO ORG(" + supplierName + ", REFERENCJE)"
            + " VALUES('" + price + "', '" + ref + "') ", cs);
    command.ExecuteReader().Close(); // takes about 75 ms
}

cs.Close();

Sorry for my English.

Problem solved (thanks to you guys, of course). As you suggested, I create a temp table and put all the imported data there, then MERGE it with the main table. I also changed the .xlsx import from one cell at a time to a range of 2 columns and 100 rows at once, which I put into a DataTable and insert with a batch method. Maybe my solution will help someone. I updated 1,000,000 entries in less than 15 minutes.

Code:

        //checks whether the temp table exists; creates it if not
        command = new SqlCommand("IF NOT (EXISTS (SELECT *"
                         + " FROM INFORMATION_SCHEMA.TABLES"
                         + " WHERE TABLE_NAME = '" + tempTable + "')) CREATE TABLE " + tempTable + "(REFERENCJE VARCHAR(255), PRODUCENT VARCHAR(255), CENA VARCHAR(255));", cs);
        cs.Open();
        command.ExecuteNonQuery();
        cs.Close();
        for (int i = 1; ; i++)
        {
            //reads a block of 2 columns and up to 100 rows from the Excel sheet into an array
            Excel.Range range = null;
            if (i * 100 > lastRow)
                range = xlWorksheet.Range["A" + ((i - 1) * 100 + 1), "B" + lastRow];
            else
                range = xlWorksheet.Range["A" + ((i - 1) * 100 + 1), "B" + (i * 100)];
            Array myValues = (System.Array)range.Cells.Value2;

            //copies the array into the DataTable (Value2 returns a 1-based array; the row count is Length / 2)
            for (int j = 1; j <= myValues.Length / 2; j++)
            {
                rowInsert = tabelaInsert.NewRow();
                rowInsert["ref"] = myValues.GetValue(j, 1).ToString();
                rowInsert["price"] = myValues.GetValue(j, 2).ToString();
                rowInsert["prod"] = orgProd;
                tabelaInsert.Rows.Add(rowInsert);
            }


            batchowanieInsert(tabelaInsert, supplierName, tempTable);
            tabelaInsert.Clear();

            if (i * 100 >= lastRow) //stop once the last row has been read (also when lastRow is an exact multiple of 100)
                break;

        }

        //SQL MERGE 2 tables
        command = new SqlCommand("MERGE ORG AS TARGET"
            + " USING " + tempTable + " AS SOURCE"
            + " ON(TARGET.REFERENCJE = SOURCE.REFERENCJE)"
            + " WHEN MATCHED AND TARGET." + supplierName + " <> SOURCE.CENA THEN"
            + " UPDATE SET TARGET." + supplierName + " = SOURCE.CENA,"
            + " TARGET.PRODUCENT = SOURCE.PRODUCENT"
            + " WHEN NOT MATCHED BY TARGET THEN"
            + " INSERT(REFERENCJE, PRODUCENT, " + supplierName + ")"
            + " VALUES(SOURCE.REFERENCJE, SOURCE.PRODUCENT, SOURCE.CENA);", cs);
        cs.Open();
        command.ExecuteNonQuery();
        cs.Close();

        //DELETING temp table
        command = new SqlCommand("DROP TABLE " + tempTable, cs);
        cs.Open();
        command.ExecuteNonQuery();
        cs.Close();


        //batchowanieInsert function
        public void batchowanieInsert(DataTable tabela, string supplierName, string tempTable)
        {
            da.UpdateBatchSize = tabela.Rows.Count;
            da.InsertCommand = new SqlCommand("INSERT INTO " + tempTable + " (REFERENCJE, PRODUCENT, CENA) VALUES (@REFERENCJE, @PRODUCENT, @CENA)", cs);
            da.InsertCommand.Parameters.Add("@REFERENCJE", SqlDbType.VarChar, 255, "ref");
            da.InsertCommand.Parameters.Add("@PRODUCENT", SqlDbType.VarChar, 255, "prod");
            da.InsertCommand.Parameters.Add("@CENA", SqlDbType.VarChar, 255, "price");
            da.InsertCommand.UpdatedRowSource = UpdateRowSource.None;
            da.Update(tabela);
        }
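
The snippet above relies on a few declarations that were not shown in the post, so the exact setup is an assumption; it would look roughly like this:

    // Assumed setup for the snippet above (not shown in the original post).
    // using System.Data; using System.Data.SqlClient;
    // Other variables (command, supplierName, tempTable, orgProd, xlWorksheet, lastRow)
    // are declared elsewhere in the importer.
    SqlConnection cs = new SqlConnection(connectionString); // connectionString is assumed
    SqlDataAdapter da = new SqlDataAdapter();                // used only for its batched InsertCommand

    // DataTable whose columns match the source-column mappings of the insert parameters
    DataTable tabelaInsert = new DataTable();
    tabelaInsert.Columns.Add("ref", typeof(string));
    tabelaInsert.Columns.Add("prod", typeof(string));
    tabelaInsert.Columns.Add("price", typeof(string));
    DataRow rowInsert;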
Piotr Grociak
  • Possible duplicate of [What are good ways to prevent SQL injection?](https://stackoverflow.com/questions/14376473/what-are-good-ways-to-prevent-sql-injection) – mjwills Jul 10 '18 at 13:06
  • Fastest or [safest](https://en.wikipedia.org/wiki/SQL_injection)? – Peter Bons Jul 10 '18 at 13:07
  • The fastest way (with millions of rows) would _generally_ be to do a `SqlBulkCopy` into a staging table, and then run the `INSERT`s or `UPDATE`s from there into the real table. Also consider using `MERGE` to avoid the need for separate `UPDATE` then `INSERT`. – mjwills Jul 10 '18 at 13:08
  • You can use a tool such as https://github.com/olegil/SqlBulkTools for this purpose. – Magnus Jul 10 '18 at 13:13
  • Sidenote: don't use `ref` as a variable name, as `ref` is a reserved word in C# – Magnetron Jul 10 '18 at 14:09
  • It might be easier to export the data from Excel into a table in the database and then run a script/select/insert that determines the difference/new rows. Why is C# mandatory? – Emond Jul 10 '18 at 16:28
  • If an answer helped you, please accept it. – Teejay Jul 11 '18 at 10:32
  • Any luck using `SqlBulkCopy` @PiotrGrociak? – mjwills Jul 13 '18 at 04:35
  • Did not use this method @mjwills – Piotr Grociak Jul 17 '18 at 15:24
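
For reference, the `SqlBulkCopy` route suggested in the comments above was not used in the end, but a minimal sketch of it could look like this (the staging table `StagingPrices` and its columns are illustrative assumptions; `tabelaInsert` is the DataTable filled from the Excel range as in the accepted solution above):

    // Minimal sketch of the SqlBulkCopy suggestion from the comments.
    // "StagingPrices" and the column names are illustrative assumptions.
    cs.Open();
    using (var bulk = new SqlBulkCopy(cs))
    {
        bulk.DestinationTableName = "StagingPrices";
        bulk.ColumnMappings.Add("ref", "REFERENCJE");   // DataTable column -> table column
        bulk.ColumnMappings.Add("price", "CENA");
        bulk.BatchSize = 5000;
        bulk.WriteToServer(tabelaInsert);               // DataTable filled from the Excel range
    }
    cs.Close();
    // ...then run a single MERGE (or UPDATE + INSERT) from StagingPrices into ORG.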

2 Answers

  1. Import all your data into a temporary table (you can use multi-row INSERT statements; SQL Server performs better with batches of ~100 records)

  2. UPDATE WHERE EXISTS

  3. INSERT WHERE NOT EXISTS
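
Against the tables from the question, steps 2 and 3 might look roughly like this (`TempPrices` and `SupplierPrice` are illustrative names; the real supplier column is built dynamically in the question's code):

    // Rough T-SQL for steps 2 and 3, run once after the staging import.
    // "TempPrices" and "SupplierPrice" are illustrative names.
    var sql = @"
        UPDATE O
        SET O.SupplierPrice = T.CENA
        FROM ORG AS O
        JOIN TempPrices AS T ON T.REFERENCJE = O.REFERENCJE;

        INSERT INTO ORG (REFERENCJE, SupplierPrice)
        SELECT T.REFERENCJE, T.CENA
        FROM TempPrices AS T
        WHERE NOT EXISTS (SELECT 1 FROM ORG AS O
                          WHERE O.REFERENCJE = T.REFERENCJE);";
    using (var cmd = new SqlCommand(sql, cs))
    {
        cs.Open();
        cmd.ExecuteNonQuery();
        cs.Close();
    }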

Teejay

Looping through individual Excel cells will always result in inefficiencies. I would suggest either:

  1. Importing the data into a SQL table and then performing your existence check with a SELECT statement instead of relying on @@ROWCOUNT. (Remember that a "rows affected" message is returned by the server and transmitted to the client for every statement you execute; use SET NOCOUNT ON to suppress it.)

  2. Or consider using CSV instead of an Excel workbook.
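
A minimal illustration of point 1 (the EXISTS check plus SET NOCOUNT ON), assuming the connection `cs` is already open as in the question's loop and the values are bound per row:

    // Sketch of point 1: check existence with EXISTS instead of @@ROWCOUNT,
    // and suppress the per-statement row-count messages with SET NOCOUNT ON.
    // Table/column names follow the question's schema; supplierName is built
    // by the caller as in the original code.
    var sql = "SET NOCOUNT ON; "
            + "IF EXISTS (SELECT 1 FROM ORG WHERE REFERENCJE = @ref) "
            + "    UPDATE ORG SET " + supplierName + " = @price WHERE REFERENCJE = @ref; "
            + "ELSE "
            + "    INSERT INTO ORG (" + supplierName + ", REFERENCJE) VALUES (@price, @ref);";
    using (var cmd = new SqlCommand(sql, cs))
    {
        // 'reference' avoids the reserved word 'ref' flagged in the comments
        cmd.Parameters.AddWithValue("@ref", reference);   // value read from the Excel row
        cmd.Parameters.AddWithValue("@price", price);
        cmd.ExecuteNonQuery();
    }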

Pholoso Mams