
I have a DataTable which contains hundreds of thousands of rows. Over the course of my procedure, I add a few thousand rows to this DataTable, which need to be added to the database too. Rather than creating an INSERT statement for each record, I would like to insert them as quickly as possible. The MySQL LOAD DATA INFILE command is not suitable, as I do not want to involve any external CSV files.

What I have done so far, is use a MySqlDataAdapter and call the 'Update' method with only the insertion changes, like so:

MySqlCommandBuilder commandBuilder = new MySqlCommandBuilder(adapter);
adapter.InsertCommand = commandBuilder.GetInsertCommand();
adapter.Update(myDataTable);

This also runs painfully slowly, so I suspect the rows are being inserted one at a time as well. What options do I have? Is building one long INSERT statement containing all the values the only way to go?

Dot NET

5 Answers


Insert multiple rows per statement, like this:

INSERT INTO tbl_name
    (a,b,c)
VALUES
    (1,2,3),
    (4,5,6),
    (7,8,9);

To optimize insert speed, combine many small operations into a single large operation.
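As asked in the comments, the same multi-row form works with parameterized queries. A rough C# sketch, assuming the MySql.Data.MySqlClient driver and the tbl_name(a, b, c) table from this answer; the helper names and the 250-row batch size are my own illustration, not anything from the question:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;
using MySql.Data.MySqlClient;

static class MultiRowInsert
{
    // Builds the SQL text for a multi-row parameterized INSERT, e.g. for two rows:
    // INSERT INTO tbl_name (a, b, c) VALUES (@a0, @b0, @c0),(@a1, @b1, @c1)
    public static string BuildMultiRowInsert(string table, string[] cols, int rowCount)
    {
        var sql = new StringBuilder();
        sql.Append("INSERT INTO ").Append(table)
           .Append(" (").Append(string.Join(", ", cols)).Append(") VALUES ");
        for (int i = 0; i < rowCount; i++)
        {
            if (i > 0) sql.Append(',');
            sql.Append('(')
               .Append(string.Join(", ", cols.Select(c => "@" + c + i)))
               .Append(')');
        }
        return sql.ToString();
    }

    // Inserts the rows in batches, binding every value as a parameter.
    public static void InsertBatched(MySqlConnection con, List<object[]> rows, int batchSize = 250)
    {
        string[] cols = { "a", "b", "c" };
        for (int start = 0; start < rows.Count; start += batchSize)
        {
            var batch = rows.Skip(start).Take(batchSize).ToList();
            using (var cmd = new MySqlCommand(BuildMultiRowInsert("tbl_name", cols, batch.Count), con))
            {
                for (int i = 0; i < batch.Count; i++)
                    for (int j = 0; j < cols.Length; j++)
                        cmd.Parameters.AddWithValue("@" + cols[j] + i, batch[i][j]);
                cmd.ExecuteNonQuery();
            }
        }
    }
}
```

Keeping batches to a few hundred rows also stays well clear of the server's max_allowed_packet limit.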

Abhishek Sharma
  • How can this be done with parameterized queries though? And would it be able to support hundreds of thousands of records? – Dot NET Nov 03 '15 at 11:22
  • Do you have indexes on the columns ? – avrono Nov 03 '15 at 11:24
  • Not all of them are indexed @avrono – Dot NET Nov 03 '15 at 11:25
  • You can use parameterised queries for this, but the code becomes messier. I wouldn't use very large numbers of records on each INSERT, as the benefits of this method reduce with larger batches. I generally go for ~250 records per insert. In PHP I use a class to do the inserts, passing it rows of data to insert; the class then takes care of building the query and deciding when to insert, with the final records being inserted when the object's destructor is called. – Kickstart Nov 03 '15 at 11:43
  • I strongly advise you to test this against creating a CSV file and deleting it afterwards. Executing ~400-1600 queries will not be faster than creating a file on the server side, using the MySqlBulkLoader, and then deleting the file. Again, this should be measured. – mybirthname Nov 03 '15 at 11:52

The only solution I see is:

1) Convert the DataTable to a CSV file (you can google how).

2) Save it on the server side in a temp directory.

3) Use MySqlBulkLoader to load the file saved in the temp directory.

4) After that, delete the file from the temp directory.
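A minimal C# sketch of steps 1-4, assuming the MySql.Data driver and that the DataTable's columns line up with the target table's columns (the naive CSV writer here assumes values contain no commas, quotes, or newlines):

```csharp
using System;
using System.Data;
using System.IO;
using System.Linq;
using MySql.Data.MySqlClient;

static class BulkLoadHelper
{
    // Steps 1-2: dump the DataTable to a comma-separated file.
    // Naive CSV: assumes values contain no commas or newlines.
    public static void WriteCsv(DataTable table, string path)
    {
        using (var writer = new StreamWriter(path))
            foreach (DataRow row in table.Rows)
                writer.WriteLine(string.Join(",", row.ItemArray.Select(v => v.ToString())));
    }

    // Steps 1-4 end to end.
    public static void BulkLoad(MySqlConnection con, DataTable table, string targetTable)
    {
        string tempFile = Path.GetTempFileName();
        try
        {
            WriteCsv(table, tempFile);
            var loader = new MySqlBulkLoader(con)   // step 3: wraps LOAD DATA INFILE
            {
                TableName = targetTable,
                FileName = tempFile,
                FieldTerminator = ",",
                LineTerminator = Environment.NewLine,
                Local = true  // send the file from the client, no FILE privilege needed
            };
            loader.Load();
        }
        finally
        {
            File.Delete(tempFile);                  // step 4: clean up
        }
    }
}
```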

mybirthname
  • While I knew about this option all along, this is the approach I ended up taking, as there was no other obvious way to achieve what I wanted. So thanks for confirming that this is the way to go. – Dot NET Nov 06 '15 at 10:00

I don't know if this is a good method with parameters, but it works well. The method receives a list of ParamDbLIST (a collection of ParamDB) and inserts rows in batches of up to 1000 rows or 1900 parameters (staying under the 2000-parameter limit). Just adapt this for your driver.

  public bool InsertBatch(System.Collections.Generic.List<ParamDbLIST> dados, string tabela)
    {
        if (dados.Count == 0)
            return true;

        string campos = "";
        dados[0].ForEach(delegate(ParamDB p)
        {
            campos += (campos == "" ? "" : ", ") + "@" + p.sNOME + "#N#";
        });

        bool resultado = true;
        //Insert in batches of up to 1000 rows, the maximum SQL Server allows per statement
        //Maximum of 2000 parameters
        int k = 0;
        while (k < dados.Count)
        {
            this.sql = new StringBuilder();
            List<String> vals = new List<string>();
            ParamDbLIST parametros_insert = new ParamDbLIST();
            int c_sqls = 0;
            int c_parametros = 0;
            while (k < dados.Count && c_sqls < 1000 && c_parametros < 1900)
            {
                c_sqls++;
                vals.Add("(" + campos.Replace("#N#", c_sqls.ToString()) + ")");
                foreach (ParamDB p in dados[k])
                {

                    p.sNOME += c_sqls.ToString();
                    parametros_insert.Add(p);
                    c_parametros++;
                }
                k++;
            }

            this.sql.Append("INSERT INTO " + tabela + "(" + campos.Replace("#N#", String.Empty).Replace("@", String.Empty) + ") VALUES " + String.Join(",", vals));

            resultado = resultado && this.RunSQL(sql.ToString(), parametros_insert);

        }

        return resultado;
    }






public class ParamDbLIST : System.Collections.ObjectModel.Collection<ParamDB>
{
    /* I have other stuff here, but this will work */
}

public class ParamDB
{
    public string sNOME { get; set; }
    public Object sVALOR { get; set; }
}

Remember that the INSERT INTO tbl_name (a,b,c) VALUES (1,2,3), (4,5,6), (7,8,9); form has a limit of 1000 rows per statement on SQL Server; MySQL has no fixed row limit, but the statement size is bounded by max_allowed_packet.

I think it would be a good idea to wrap this in a transaction, for safety.

The method you will need to supply yourself is RunSQL.

If this method can be optimized, please let me know.
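For completeness, here is a purely hypothetical sketch of what RunSQL might look like; the answer does not show it, so everything beyond the ParamDB/ParamDbLIST classes above is a guess (I also pass the connection explicitly instead of using a field, and assume MySql.Data.MySqlClient):

```csharp
using System;
using MySql.Data.MySqlClient;

static class SqlRunner
{
    // Hypothetical RunSQL: binds the accumulated batch parameters and
    // executes the multi-row INSERT built by InsertBatch above.
    public static bool RunSQL(MySqlConnection connection, string sql, ParamDbLIST parametros)
    {
        using (var cmd = new MySqlCommand(sql, connection))
        {
            foreach (ParamDB p in parametros)
                cmd.Parameters.AddWithValue("@" + p.sNOME, p.sVALOR ?? DBNull.Value);
            return cmd.ExecuteNonQuery() >= 0; // no exception => treat as success
        }
    }
}
```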

Charles Stein

Not sure about MySQL, but I found that with SQL Server by far the fastest way to insert was similar to @Rahautos's, but separating the value lists into individual queries. I know it sounds odd, but this improved the speed tenfold, from 1,200 to 12,000 inserts per second. I am not sure what the difference was. The transaction also helps immensely.

INSERT INTO tbl_name
    (a,b,c)
VALUES
    (1,2,3);
INSERT INTO tbl_name
    (a,b,c)
VALUES
    (4,5,6);
INSERT INTO tbl_name
    (a,b,c)
VALUES
    (7,8,9);

Example code

using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApplication2
{
    class Program
    {
        static void Main(string[] args)
        {
            var constring = (new SqlConnectionStringBuilder
            {
                DataSource = "someserver",
                InitialCatalog = "12trunk",
                IntegratedSecurity = true
            }).ToString();
            using (var con = new SqlConnection(constring))
            {
                con.Open();
                using (var trans = con.BeginTransaction(isolationLevel: System.Data.IsolationLevel.ReadUncommitted) as SqlTransaction)
                using (var cmd = new SqlCommand())
                {
                    cmd.Transaction = trans;
                    cmd.Connection = con;
                    var start = DateTime.Now;
                    Console.WriteLine("Start = " + start);
                    const int inserts = 100000;
                    var builder = new StringBuilder();
                    cmd.CommandText = "delete from test";
                    cmd.ExecuteNonQuery(); // actually clear the table; the original set CommandText but never executed it
                    for (int i = 0; i < inserts; i++)
                    {
                        Guid[] guids = new Guid[7];
                        for (int j = 0; j < 7; j++)
                        {
                            guids[j] = Guid.NewGuid();
                        }
                        var sql = $"insert into test (f0, f1, f2, f3, f4, f5, f6) values ('{guids[0]}', '{guids[1]}', '{guids[2]}','{guids[3]}', '{guids[4]}', '{guids[5]}', '{guids[6]}');\n";
                        builder.Append(sql);
                        if (i % 1000 == 0)
                        {
                            cmd.CommandText = builder.ToString();
                            cmd.ExecuteNonQuery();
                            builder.Clear();
                        }

                    }
                    cmd.CommandText = builder.ToString();
                    cmd.ExecuteNonQuery();
                    trans.Commit();
                    var ms = (DateTime.Now - start).TotalMilliseconds;
                    Console.WriteLine("Ms to run = " + ms);
                    Console.WriteLine("inserts per sec = " + inserts / (ms / 1000));
                    Console.ReadKey();
                }
            }
        }
    }
}
Justin

I'm not sure about MySQL, but in MS SQL Server we use XML in this case: we create an XML document containing all the data, then run a single INSERT that reads from the XML.

-- @input holds the XML built from your rows, e.g.:
DECLARE @input XML = N'<EmployeList>
    <Employee><NAME>John</NAME><EMAIL>john@example.com</EMAIL></Employee>
    <Employee><NAME>Jane</NAME><EMAIL>jane@example.com</EMAIL></Employee>
</EmployeList>';

INSERT INTO [TBL_USER_INFO](NAME, EMAIL)
   SELECT
      Result.value('(NAME)[1]', 'varchar(25)') as NAME,
      Result.value('(EMAIL)[1]', 'varchar(50)') as EMAIL
   FROM
      @input.nodes('/EmployeList/Employee') AS ABCD(Result)
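A hedged C# sketch of the client side, assuming System.Data.SqlClient and the TBL_USER_INFO table above. Note that DataTable.WriteXml emits a DocumentElement root rather than the EmployeList element used in the answer, so the .nodes() path here differs accordingly:

```csharp
using System.Data;
using System.Data.SqlClient;
using System.IO;

static class XmlInsert
{
    // Serializes the rows to XML; DataTable.WriteXml emits
    // <DocumentElement><Employee><NAME>...</NAME><EMAIL>...</EMAIL></Employee>...</DocumentElement>
    public static string ToXml(DataTable employees)
    {
        employees.TableName = "Employee";
        var sw = new StringWriter();
        employees.WriteXml(sw);
        return sw.ToString();
    }

    // Passes the XML as a single parameter and shreds it server-side.
    public static void InsertViaXml(SqlConnection con, DataTable employees)
    {
        const string sql = @"
            INSERT INTO TBL_USER_INFO (NAME, EMAIL)
            SELECT Result.value('(NAME)[1]',  'varchar(25)'),
                   Result.value('(EMAIL)[1]', 'varchar(50)')
            FROM @input.nodes('/DocumentElement/Employee') AS ABCD(Result);";

        using (var cmd = new SqlCommand(sql, con))
        {
            cmd.Parameters.Add("@input", SqlDbType.Xml).Value = ToXml(employees);
            cmd.ExecuteNonQuery();
        }
    }
}
```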