
I am using a C# application to manage a MySQL database.

What I want to do is:

  • Read some records.
  • Run some functions to calculate "stuff".
  • Insert "stuff" to database.

In order to calculate the n-th "stuff", I must have already calculated the (n-1)-th "stuff".

This is what I do:

Declare:

static MySqlCommand cmd;
static MySqlDataReader dr;

My main loop looks like the following:

for (...)
{
    dr.Close();
    cmd.CommandText = "insert into....";
    dr = cmd.ExecuteReader();
}

This is taking way too long. The total number of rows to be inserted is about 2.5M.

When I use a MySQL database on a regular server, it takes about 100-150 hours. When I use a localhost database, it takes about 50 hours.

I think there should be a quicker way. My thoughts:

  • I think that I currently connect to the db and disconnect from it every time I loop. Is that true?
  • Could I create a CommandText that contains, for example, 100 queries (separated by semicolons)? Is this possible?
  • Instead of executing the queries, output them to a text file (the file will be about 300 MB), then insert them into the db using phpMyAdmin. (Bonus question: I'm using phpMyAdmin. Is this OK? Is there a better, maybe non-web, interface?)
  • Could you do some sort of bulk insert in one command? – DROP TABLE users Jan 02 '14 at 17:14
  • @DROPtableusers bulk insert? you mean my 2nd thought? – Thanos Darkadakis Jan 02 '14 at 17:15
  • @SLaks that might help slightly, but he's still performing 2.5M separate inserts. – Dave Zych Jan 02 '14 at 17:15
  • @SLaks will it be that much quicker? I mean like 1h instead of 100h? – Thanos Darkadakis Jan 02 '14 at 17:15
  • What is the calculation doing that can't be done in SQL? If it can be, you do the lot on the server and you don't have to take the hit of pulling millions of records and pushing however many back up. – Tony Hopkinson Jan 02 '14 at 17:17
  • Instead of hitting the db 2.5 million times, I think you can build a multi-row insert for some number K of rows, ExecuteNonQuery() it, then build the next K and execute again – Sandip Bantawa Jan 02 '14 at 17:18
  • @ThanosDarkadakis a little different than what you put there I think. something like this `INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);` could be quicker, definitely not processing 1000's of separate queries depending on how it is handled by mysql. – DROP TABLE users Jan 02 '14 at 17:18
  • @ThanosDarkadakis Building a bulk query would be much better - I bet most of the runtime is sending the command to the database rather than the actual time spent running the insert command. You can probably run batches of 10,000+, but if you do too many at a time you'll run into memory issues in mysql. – Dave Zych Jan 02 '14 at 17:18
  • Is there any reason why you're using a reader to do inserts? – SQLMason Jan 02 '14 at 17:44
  • possible duplicate of [How to do a batch insert in MySQL](http://stackoverflow.com/questions/5526917/how-to-do-a-batch-insert-in-mysql) – SQLMason Jan 02 '14 at 17:45
  • If you can do the math on the SQL server, use a stored procedure. Constantly opening and closing things is nuts. – 3Dave Jan 02 '14 at 18:15

2 Answers


Try using a bulk insert. I found this syntax here. And then use ExecuteNonQuery() as SLaks suggested in the comments. Those combined may speed it up a good bit.

INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
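For example, here is a minimal C# sketch of building such a statement in batches. The connection `conn`, the batch size, and the `tbl_name(a,b,c)` schema are assumptions; adapt them to your own code:

    /* Minimal sketch -- requires the System.Collections.Generic and
       MySql.Data.MySqlClient namespaces. conn is assumed to be an
       open MySqlConnection. */
    static void BulkInsert(MySqlConnection conn, IEnumerable<int[]> rows)
    {
        const int batchSize = 1000;       /* larger batches = fewer round trips */
        var values = new List<string>();

        using (MySqlCommand cmd = conn.CreateCommand())
        {
            int i = 0;
            foreach (int[] row in rows)   /* each row holds the values for (a,b,c) */
            {
                /* parameters avoid quoting and injection problems */
                values.Add("(@a" + i + ",@b" + i + ",@c" + i + ")");
                cmd.Parameters.AddWithValue("@a" + i, row[0]);
                cmd.Parameters.AddWithValue("@b" + i, row[1]);
                cmd.Parameters.AddWithValue("@c" + i, row[2]);
                i++;

                if (i == batchSize)
                {
                    cmd.CommandText = "INSERT INTO tbl_name (a,b,c) VALUES " + string.Join(",", values);
                    cmd.ExecuteNonQuery();
                    cmd.Parameters.Clear();
                    values.Clear();
                    i = 0;
                }
            }

            if (values.Count > 0)         /* flush the final partial batch */
            {
                cmd.CommandText = "INSERT INTO tbl_name (a,b,c) VALUES " + string.Join(",", values);
                cmd.ExecuteNonQuery();
            }
        }
    }

Note that MySQL's max_allowed_packet setting caps how large a single statement can be, so very large batches may need that setting raised.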

It's possible you are using InnoDB as your storage engine. In this case you should try wrapping every hundred or so rows of INSERT operations in a transaction. When I have had to handle this kind of application, it made a huge difference. To do this, structure your code like so:

    MySqlCommand start = conn.CreateCommand();   /* conn is your open MySqlConnection */
    start.CommandText = "START TRANSACTION";
    MySqlCommand commit = conn.CreateCommand();
    commit.CommandText = "COMMIT";
    const int bunchSize = 100;
    int remaining = bunchSize;

    start.ExecuteNonQuery();   /* start the first bunch transaction */

    for (/* whatever loop conditions you need */) {

        /* whatever you need to do */

        /* your insert statement */
        if (--remaining <= 0) {
            commit.ExecuteNonQuery();   /* end one bunch transaction */
            start.ExecuteNonQuery();    /* and begin the next */
            remaining = bunchSize;
        }
    }
    commit.ExecuteNonQuery();   /* end the last bunch transaction */

It is also possible that the table to which you're inserting megarows has lots of indexes. In this case you can speed things up by beginning your series of INSERTs with

 SET unique_checks=0;
 SET foreign_key_checks=0;
 ALTER TABLE tbl_name DISABLE KEYS;

and ending it with this sequence.

 ALTER TABLE tbl_name ENABLE KEYS;
 SET unique_checks=1;
 SET foreign_key_checks=1;

You must take great care in your software to avoid inserting rows that would be rejected as duplicates when you use this technique, because the ENABLE KEYS operation will not work in that case.
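If you drive these statements from C#, a minimal sketch of the wrapping calls might look like this (again, `conn` is assumed to be your open MySqlConnection, and `tbl_name` your target table):

    using (MySqlCommand cmd = conn.CreateCommand()) {
        /* relax checks before the bulk load */
        cmd.CommandText = "SET unique_checks=0";
        cmd.ExecuteNonQuery();
        cmd.CommandText = "SET foreign_key_checks=0";
        cmd.ExecuteNonQuery();
        cmd.CommandText = "ALTER TABLE tbl_name DISABLE KEYS";
        cmd.ExecuteNonQuery();

        /* ... run all of your INSERTs here ... */

        /* restore checks afterwards */
        cmd.CommandText = "ALTER TABLE tbl_name ENABLE KEYS";
        cmd.ExecuteNonQuery();
        cmd.CommandText = "SET foreign_key_checks=1";
        cmd.ExecuteNonQuery();
        cmd.CommandText = "SET unique_checks=1";
        cmd.ExecuteNonQuery();
    }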

Read this for more information: http://dev.mysql.com/doc/refman/5.5/en/optimizing-innodb-bulk-data-loading.html
