
I'm programmatically importing rows from an Excel file into a database. The file has 10 columns × 30,000 rows. I read the data into a PHP array and then insert it into the database.

After uploading the file, it takes 7-8 minutes to insert all the rows into the database. I know two ways of inserting the rows.

1st method:

Generate one dynamic multi-row query, something like:

INSERT INTO table_name (col1, col2,..) VALUES (row1, row1, ....), (row2, row2, ....), (row3, row3, ....), (row4, row4, ....), ...

and run the query to insert all rows.
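
For reference, this is roughly how the 1st method can be built without string concatenation, using positional placeholders and chunking so each statement stays well below MySQL's `max_allowed_packet`. A minimal sketch, assuming `$con` is the PDO connection, `$rows` holds flat arrays of the 10 processed values, and a chunk size of 500 (all assumptions):

$chunkSize = 500; // assumed batch size; tune so each statement stays below max_allowed_packet

foreach (array_chunk($rows, $chunkSize) as $chunk) {
    $placeholders = array();
    $values = array();

    foreach ($chunk as $row) {
        // One "(?, ?, ...)" group per row; $row is assumed to hold the
        // 10 processed column values as a flat array.
        $placeholders[] = '(' . implode(', ', array_fill(0, count($row), '?')) . ')';
        foreach ($row as $value) {
            $values[] = $value;
        }
    }

    // List all 10 column names here in place of "col1, col2, ..."
    $sql = 'INSERT INTO table_name (col1, col2, ...) VALUES ' . implode(', ', $placeholders);
    $con->prepare($sql)->execute($values);
}

Multi-row inserts like this are usually much faster than row-by-row INSERTs because the per-statement overhead is paid once per chunk instead of once per row.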

2nd method:

$result = $con->prepare("INSERT INTO table_name (col1, col2, ...) VALUES (:col1, :col2, ...)");
foreach ($rows as $row) { // Loop through all rows and insert them
    $result->bindParam(':col1', $val1);
    $result->bindParam(':col2', $val2);
    $result->bindParam(':col3', $val3);
    // ... bind the remaining columns
    $result->execute();
}

The 1st method seems messy and inefficient, so I'm using the 2nd, but it inserts only 500-700 rows per second and takes 7-8 minutes overall to insert all the rows.

What other methods are more efficient and faster than these?

EDIT: Please don't suggest importing the Excel file directly into MySQL. The data needs to be processed before it is inserted into the database.

Bhavesh G
  • Convert to CSV and then use `LOAD DATA INFILE` (see the sketch after these comments). – Sirko May 24 '15 at 09:07
  • See http://stackoverflow.com/questions/18915104/php-import-csv-file-to-mysql-database-using-load-data-infile – Federkun May 24 '15 at 09:08
  • Can you wrap this up into a transaction? `COMMIT`ing data uses resources, and if you have AUTO COMMIT enabled, it has to do this for each `INSERT` - check [this link](https://dev.mysql.com/doc/refman/5.0/en/commit.html) - you'll have to disable auto commit in PDO too - [check here](http://php.net/manual/en/pdo.transactions.php) - don't forget to `COMMIT` your data at the end if you adopt this method (for what you are trying to do, though, I would definitely recommend it) – ash May 24 '15 at 10:11
  • Thanks. I'll try `COMMIT` manually in code. – Bhavesh G May 24 '15 at 10:17
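
Following up on Sirko's comment, here is a minimal sketch of the CSV + `LOAD DATA LOCAL INFILE` route, applied after the rows have already been processed in PHP. The connection details, temporary-file handling, and column list are assumptions, and `PDO::MYSQL_ATTR_LOCAL_INFILE` on the client plus `local_infile` on the server must be enabled for this to work:

// Write the already-processed rows to a temporary CSV file ...
$csvPath = tempnam(sys_get_temp_dir(), 'import_');
$fh = fopen($csvPath, 'w');
foreach ($rows as $row) {
    fputcsv($fh, $row); // $row = the 10 processed column values
}
fclose($fh);

// ... then let MySQL bulk-load the file in a single statement.
// Connection credentials here are placeholders.
$con = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass',
    array(PDO::MYSQL_ATTR_LOCAL_INFILE => true));

$con->exec("LOAD DATA LOCAL INFILE " . $con->quote($csvPath) . "
            INTO TABLE table_name
            FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
            LINES TERMINATED BY '\\n'
            (col1, col2, ...)"); // list all 10 columns here

unlink($csvPath);

`LOAD DATA INFILE` is usually the fastest bulk-load path MySQL offers, typically well ahead of issuing individual INSERT statements.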

2 Answers


The fastest way is to wrap several hundred inserts into a transaction and then commit. MySQL can then flush many rows to disk in far fewer I/O operations, which is much quicker than committing each INSERT individually under autocommit.

$result = $con->prepare("INSERT INTO table_name (col1, col2, ...) VALUES (:col1, :col2, ...)");

$inserts = 1000; // commit after this many rows
$counter = 0;

foreach ($rows as $row) { // Loop through all rows and insert them

    if (!$con->inTransaction()) {
        $con->beginTransaction();
    }

    $result->bindParam(':col1', $val1);
    $result->bindParam(':col2', $val2);
    $result->bindParam(':col3', $val3);
    // ... bind the remaining columns
    $result->execute();

    $counter++;

    if ($counter % $inserts === 0) {
        $con->commit(); // flush this batch; a new transaction starts on the next iteration
    }
}

if ($con->inTransaction()) {
    $con->commit(); // commit whatever rows are left over
}

The code above commits after every 1000 inserts. I haven't tested it against your schema, but its purpose is to illustrate how to wrap batches of 1000 inserts in a transaction and commit them.

N.B.
  • The commit check should use the modulus operator, otherwise it will only execute once. Something like this: `if (($counter % $inserts) == 0) { ...` – The Silencer Jan 31 '16 at 22:44

Try wrapping your SQL in a TRANSACTION instead, COMMITting the data at the end. This frees up a lot of resources, as each COMMIT does quite a bit of work (it tells the database to flush the inserted rows to disk and update any indexes you have).

Also, try to make sure your connection to the database is persistent (pooled) - this means you won't grab a new connection from the DB each time, but reuse the same one instead.

The risk of using a single transaction, though, is that if one error occurs in your dataset (preventing an INSERT), the entire dataset is rolled back.

Something like this could work:

<?php
  try {
    $dbh = new PDO('odbc:SAMPLE', 'db2inst1', 'ibmdb2',
      array(PDO::ATTR_PERSISTENT => true)); // persistent ("pooled") connection
    echo "Connected\n";
  } catch (Exception $e) {
    die("Unable to connect: " . $e->getMessage());
  }

  try {
    $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $dbh->beginTransaction();
    $result = $dbh->prepare("INSERT INTO table_name (col1, col2, ...) VALUES (:col1, :col2, ...)");

    foreach ($rows as $row) { // Loop through all rows and insert them
      // Bind your processed values for this row here
      $result->bindParam(':col1', $val1);
      $result->bindParam(':col2', $val2);
      $result->bindParam(':col3', $val3);
      // ... bind the remaining columns
      $result->execute();
    }

    $dbh->commit(); // commit only once, after all rows have been inserted

  } catch (Exception $e) {
    $dbh->rollBack();
    echo "Failed: " . $e->getMessage();
  }
?>
ash
  • @N.B. answer is much better - still, consider POOLING the connection though. – ash May 24 '15 at 10:25
  • Does it make any difference if I don't use `array(PDO::ATTR_PERSISTENT => true)); ` ? Because I'm using a database configuration file and I guess not all queries need it. – Bhavesh G May 24 '15 at 10:31
  • You don't have to in the example above, as it'll be using the same connection anyway - but, if you were to use your original example in your question, then I would consider it, yes. – ash May 24 '15 at 10:32
  • A good post [here](http://stackoverflow.com/questions/9736188/mysql-persistent-connection-vs-connection-pooling) to have a look at on SO. Also [here](http://en.wikipedia.org/wiki/Connection_pool) on Wikipedia – ash May 24 '15 at 10:34