2

I have a C++ program that insert about a million of records into MS Access DB using OLEDBConnection. To do that, I ran the INSERT INTO query a millions time in order to get the records inserted which take quite a long time.

The data is generated in the program in form of array, will that be any other way that i can load the data into database in one single step to improve the performance?

Thanks!

Loop i use to insert the records currently

for (int i = 0; i < populationSize; i++){
    insertSQL = "INSERT INTO [" + pTableName + "] (" + columnsName + ") VALUES (" + columnsValue[i] + ");";`
    outputDBConn->runSQLEdit(insertSQL);
}

Method that run the SQL query

void DBConnector::runSQLEdit(String^ query){
    SQLCMD = gcnew OleDbCommand( query, dbConnection );
    SQLCMD->CommandTimeout = 30;
    dbConnection->Open();
    SQLCMD->ExecuteNonQuery();
    dbConnection->Close();
    }
Curtis
  • 101,612
  • 66
  • 270
  • 352
Raymond
  • 65
  • 1
  • 8
  • 1
    MS Access and a million of records don't seem to be a good couple. Have you heard of SQLite? – ali_bahoo Dec 22 '11 at 08:13
  • It might be quickest to write the array to a CSV and load that into Access in one step. Make sure you have no indexes on the table into which you are loading the data. – Fionnuala Dec 22 '11 at 11:48

2 Answers2

3

It seems very inefficient to open/close the connection for each insert statement.

The standard approach goes something like:

  1. Open connection.
  2. Start transaction, if supported. (This is often very important for databases with transactions.)
  3. Insert. Repeat this step as needed.
  4. Commit transaction, if supported.
  5. Close connection.

Update: The following does not apply to MS Access. Access does not support inserting multiple rows from a literal. It only supports inserting multiple rows from an existing data source. (Although here is a "workabout" that might work. In any case, the most important thing is likely limiting the number of transactions.)

One more thing that can be done is to build a single insert command that adds multiple records at once. This can be done with either multiple statements or a multi-record insert (if supported). It may or may not be significantly faster than just the above (depends upon other factors like network latency and database engine) and will likely need to be adapted to fit within the restrictions of the database (e.g. might only be feasible for a few hundred records at once). This should only be considered after proper connection/transaction usage as described above.

I wouldn't be surprised if there we already-made "bulk insert" libraries/modules floating about... and I don't use MS Access so I can only hope that the above suggestions were helpful :-)

Happy coding.

Community
  • 1
  • 1
  • Thanks pst, the access time is much shorter after i move the open/close thing out of the loop. Sadly, MS Access only support 1 SQL command a time.... – Raymond Dec 30 '11 at 08:02
  • @Raymond :( I found this [how to insert multiple rows](http://stackoverflow.com/questions/1212943/sql-code-to-insert-multiple-rows-in-ms-access-table), which agrees. Make sure that, like this other post, you are using a *single transaction* (or multiple batched transactions) to avoid the overhead of automatic commits after each INSERT. (If you are only hitting about 60 rows/second I would suspect transactions were limiting the throughput.) –  Dec 30 '11 at 08:14
1

Don't do ONE insertion per command. Change your code to something like, this:

string strSQLCommand;
for (int i = 0; i < populationSize; i++){
strSQLCommand += "INSERT INTO [" + pTableName + "] (" + columnsName + ") VALUES (" + columnsValue[i] + ");";`
}
outputDBConn->runSQLEdit(strSQLCommand );

I'm not sure what's the max buffer size of the command, so do some checks and then get the best value to do some "breaks" at every X inserts.

João Augusto
  • 2,285
  • 24
  • 28
  • Thanks but it seems to me that MS Access does not support more than one SQL command a time so can only do one insert each time..... – Raymond Dec 30 '11 at 08:00