3

I have a text file with 2,000,000 lines and want to insert each line into a database row, so I am inserting 2,000,000 records into SQL Server using a PHP file. What is the fastest way to do that?

For testing I am using this code:

for ($i = 1; $i <= 2000000; $i++)
{
    $sql = "INSERT INTO BlastSequenceDim (Seq_id) VALUES ('$i')";
    $stmt = sqlsrv_query($conn, $sql);
    if ($stmt === false)
    {
        die(print_r(sqlsrv_errors(), true));
    }
}

But it takes a lot of time. Can this query execute in seconds?

Thanks,

Alaa
  • What does this text file contain? – Charlotte Dunois Mar 25 '16 at 14:35
  • 1
    I don't think there is a way to insert 2 million records in seconds while going through the connection and overhead associated but it will be faster to insert in batches rather than one at a time. See http://stackoverflow.com/questions/5526917/how-to-do-a-batch-insert-in-mysql for details. – Technoh Mar 25 '16 at 14:35
  • This should not be a real problem; it will probably take a few seconds to run. The fastest way is per block of multiple inserts if you can. – Loenix Mar 25 '16 at 14:36
  • @David Vogel the connection is opened only once, outside the for loop – Alaa Mar 25 '16 at 14:40

4 Answers

3

Please use bulk insert; it will significantly improve performance and save time.

Learn more about the BULK INSERT query in SQL

You can also save time by cutting out PHP and accessing the SQL database directly; this is common practice for initially populating a database.

Another link
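As a rough, untested sketch of what BULK INSERT could look like from the OP's PHP code (reusing $conn from the question; the file path and row terminator are assumptions, and the path must be visible to the SQL Server machine itself, since BULK INSERT reads the file on the server):

// Sketch only: BULK INSERT reads the file on the SQL Server machine,
// so 'C:\data\sequences.txt' is a hypothetical path on that server.
$sql = "BULK INSERT BlastSequenceDim
        FROM 'C:\\data\\sequences.txt'
        WITH (ROWTERMINATOR = '\\n', TABLOCK)";

$stmt = sqlsrv_query($conn, $sql);
if ($stmt === false)
{
    die(print_r(sqlsrv_errors(), true));
}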

Shadi Zidan
  • do you have a quick example of this bulk insert? I will read the link but if you have a simple example it will be better – Alaa Mar 25 '16 at 14:42
  • Instead of copy-pasting an example, I would prefer that you read the following article; it should be the example you are looking for: http://www.sqlteam.com/article/using-bulk-insert-to-load-a-text-file – Shadi Zidan Mar 25 '16 at 14:50
  • I want to test it first. I do not have the text file yet. One of my application's tasks is to generate this text file and then insert it into the database. I am now in the testing phase. I want to know how long it will take to insert millions of records into SQL Server, and as you can see in my code I am just inserting the value of i because I am in the testing phase before going deeper. – Alaa Mar 25 '16 at 14:58
  • Dear Alaa, this is the most efficient way to insert a huge number of rows. If you are really interested in testing, please see this link: https://venzi.wordpress.com/2012/03/31/loading-data-fast-regular-insert-vs-bulk-insert/ – Shadi Zidan Mar 25 '16 at 15:06
  • Please mark as correct answer if you think this helped. Thanks – Shadi Zidan Mar 27 '16 at 07:55
2

Instead of running each query individually, concatenate the queries together and run them all at once.

$sql = "";   // initialize so .= does not append to an undefined variable
for ($i = 1; $i <= 2000000; $i++)
{
    $sql .= "INSERT INTO BlastSequenceDim (Seq_id) VALUES ('$i');";
}

$stmt = sqlsrv_query($conn, $sql);
if ($stmt === false)
{
    die(print_r(sqlsrv_errors(), true));
}
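If the single concatenated string runs into PHP's memory limit (as in the comment below), a variation worth trying (untested sketch, and the batch size is an arbitrary choice) is to flush the statement every few thousand rows:

$sql = "";
$batchSize = 5000;   // arbitrary batch size, tune as needed
for ($i = 1; $i <= 2000000; $i++)
{
    $sql .= "INSERT INTO BlastSequenceDim (Seq_id) VALUES ('$i');";
    if ($i % $batchSize === 0)
    {
        // send this batch and start a new one
        if (sqlsrv_query($conn, $sql) === false)
        {
            die(print_r(sqlsrv_errors(), true));
        }
        $sql = "";
    }
}
if ($sql !== "")
{
    // send whatever is left over
    if (sqlsrv_query($conn, $sql) === false)
    {
        die(print_r(sqlsrv_errors(), true));
    }
}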
Ray
  • Although this is a solution, there are more efficient ways. – C_B Mar 25 '16 at 14:39
  • @Ray I am trying your code and got this error: Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 225778224 bytes) in C:\inetpub\wwwroot\webclient\saveResult.php on line 106. Line 106 is : $stmt = sqlsrv_query( $conn, $sql); – Alaa Mar 25 '16 at 14:49
  • @Alaa Not really a good solution - I would strongly advise against implementing it. The solution has a limit which is determined by the size of the file. You can for sure increase the PHP memory, but what then? – Peter VARGA Mar 25 '16 at 16:57
0

Usually the bottleneck is that PHP is not running multi-threaded. This means in your case one process is reading the file sequentially and inserting row by row, even if your server/PC has 8 or more cores.

I had the same problem: I had to insert a lot of values into memcache keys, and I solved it with PHP pthreads.

It is for sure more work, but when it is a repeating task [as it was in my case] then it is definitely worth implementing with pthreads.

As a hint, I would suggest that the total number of started threads should not be greater than number_of_cpu_cores x 1.5.

There is only one small problem: how do you coordinate the reading from the text file? Here I would consider passing two numbers to each thread:

  • Start at row number X
  • The offset which is added to X so that you skip the rows read by the other threads.

You will be surprised by the performance gain! Combined with the other answers to your question, it will be unbeatable...
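A rough sketch of that idea (untested; it assumes the pthreads extension on a thread-safe PHP build plus the sqlsrv driver, and the file name, connection details and thread count are placeholders):

// Sketch only -- requires the pthreads extension (thread-safe PHP build).
// Each thread handles every $step-th line, starting at $start, and opens
// its own connection.
class InsertWorker extends Thread
{
    private $file;
    private $start;
    private $step;

    public function __construct($file, $start, $step)
    {
        $this->file  = $file;
        $this->start = $start;
        $this->step  = $step;
    }

    public function run()
    {
        // hypothetical connection details -- adjust to your server
        $conn = sqlsrv_connect("localhost", ["Database" => "MyDb"]);
        $lineNo = 0;
        foreach (new SplFileObject($this->file) as $line) {
            if ($lineNo++ % $this->step !== $this->start) {
                continue;   // this row belongs to another thread
            }
            $value = rtrim($line, "\r\n");
            sqlsrv_query($conn,
                "INSERT INTO BlastSequenceDim (Seq_id) VALUES (?)",
                [$value]);
        }
    }
}

$threadCount = 6;   // ~ number_of_cpu_cores x 1.5
$threads = [];
for ($t = 0; $t < $threadCount; $t++) {
    $threads[$t] = new InsertWorker("sequences.txt", $t, $threadCount);
    $threads[$t]->start();
}
foreach ($threads as $t) {
    $t->join();
}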

Peter VARGA
  • I believe this answer is misleading. In most situations, if you are doing "read line; insert into SQL" in a loop, the PHP process is spending more than 95% of the time idle. After each insert PHP has to wait for the data to get to the SQL server (possibly over the network) and for SQL to commit it to disk. Running more threads in parallel *will* improve throughput, but the "best" number of threads is unlikely to be related to core count, since PHP is not CPU bound in this case. – 9072997 May 19 '22 at 15:37
0

One way to do this is to encode the entire data set as JSON and send it in a single query. There are a couple of different ways to achieve this. The Microsoft docs discuss one here. They insert JSON objects in the example (so if you wanted to do it in bulk you would use an array of objects).

I will instead use an array of arrays. This makes for slightly less readable SQL, but decreases overhead. Also, I am accessing MSSQL via PDO, but you could just as easily use sqlsrv.

$stmt = $db->prepare(<<<EndSQL
CREATE TABLE #exampleTable (
    A VARCHAR(200),
    B VARCHAR(200)
)

INSERT INTO #exampleTable
SELECT * FROM OPENJSON(?) WITH (
    A VARCHAR(200) '$[0]',
    B VARCHAR(200) '$[1]'
);

DROP TABLE #exampleTable;
EndSQL);

$data = [];
for($i = 0; $i < 100000; $i++) {
    // plain arrays (not keyed objects) so the '$[0]' / '$[1]' paths in OPENJSON above find the values
    $data[] = ['A-val-' . $i, 'B-val-' . $i];
}
$stmt->execute([json_encode($data)]);

This is similar to the strategy where you stick a bunch of inserts in the same statement, but it allows you to use a static SQL statement regardless of how many inserts you have.

INSERT INTO #exampleTable VALUES
(?, ?),
(?, ?),
(?, ?),
(?, ?),
(?, ?),
(?, ?),
(?, ?),
...

If you do use this, the next limit you might run into is memory. This loads everything into PHP's memory before sending it to SQL. There are 2 ways to get around this: do batches of ~10,000 lines or stream JSON to SQL Server as you generate it.
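For the batching option, a rough, untested sketch that reuses the same hypothetical prepared statement from above (note that the demo statement creates and drops a temp table on every execute; in a real script you would insert into your permanent table instead):

$batchSize = 10000;
$batch = [];
for ($i = 0; $i < 2000000; $i++) {
    $batch[] = ['A-val-' . $i, 'B-val-' . $i];
    if (count($batch) === $batchSize) {
        $stmt->execute([json_encode($batch)]);   // one round trip per 10,000 rows
        $batch = [];
    }
}
if ($batch) {
    $stmt->execute([json_encode($batch)]);       // leftover rows
}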

9072997