I have Excel data with more than 5k rows and 17 columns, and I use a nested-loop technique in PHP, but it takes a long time: processing the data from the xls file format takes 45 minutes, while the csv file format takes 30 minutes. Is there a technique to speed up uploading the file from Excel to the database (I use PostgreSQL)?

I use a nested loop because the number of columns depends on the parameters, and whether each row is INSERTed or UPDATEd in the database also depends on the parameters.

Here is my code for the import process

    <?php
    $row = 5000; // estimated rows
    $col = 17;   // estimated columns
    for ($i = 1; $i <= $row; $i += 1) {
        for ($j = 1; $j <= $col; $j += 1) {
            $custno = $sheetData[$i][0];
            // look up the customer (simplified: the query is executed and its result checked here)
            $getId = "SELECT id FROM data WHERE custno = $custno";

            if ($getId) {
                // customer exists: update the row
                $update = "UPDATE data SET address = 'address 1' WHERE custno = $custno";
            } else {
                // customer does not exist: insert a new row
                $insert = "INSERT INTO data (address) VALUES ('address jon')";
            }
        }
    }

I use the PhpSpreadsheet library.
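
For context, $sheetData in the code above is produced roughly like this (simplified; the file name is just a placeholder):

    <?php
    require 'vendor/autoload.php';

    use PhpOffice\PhpSpreadsheet\IOFactory;

    // load the workbook and dump the active sheet into a plain array
    $spreadsheet = IOFactory::load('customers.xlsx');
    $sheetData   = $spreadsheet->getActiveSheet()->toArray();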

  • Have a look at https://stackoverflow.com/questions/33095610/reading-very-large-more-than-100mb-excel-files-in-php – Amanjot Kaur Dec 31 '19 at 07:38
  • Is your spreadsheet actually 5k rows and 17 columns or is this just a guess? – Nigel Ren Dec 31 '19 at 07:49
  • @AmanjotKaur thank you for the reference, but in my case it doesn't crash; I need a nested-loop technique that can speed up the process – Yonathan Rizky Nathanael Dec 31 '19 at 07:51
  • @NigelRen Yes. For now it's that much, but going forward 10k rows, and maybe more than 5 columns or only 5 columns – Yonathan Rizky Nathanael Dec 31 '19 at 07:53
  • @YonathanRizkyNathanael This is what I am talking about. That library is designed to improve Excel reading in memory. Read the post I shared carefully. Also, your page can crash for various reasons, like if the server is slow or the internet is very slow... A lot of other reasons as well... Because you are dealing with a vast amount of data. Do not ignore anything. – Amanjot Kaur Dec 31 '19 at 08:01
  • Without actually seeing what is done in your loop it is not really possible to guess where the issue might be, but there is a very good chance that doing queries (even one) in a loop that size will be a major bottleneck. If you [edit] your post and include the code in the loop along with sample data and the expected output, we may be able to help. – Dave Dec 31 '19 at 11:56

2 Answers

First, try to find out what the root of the issue is: is iterating over the file slow, or are too many SQL queries being executed in the meantime?

Bear in mind that running queries in a loop is always asking for performance trouble. Maybe you can avoid that by fetching the data you need before processing the file? You may not be able to tell exactly which data are needed at that point, but fetching more than you need can still be faster than making separate queries one by one. I would also encourage you to limit the INSERT and UPDATE queries; they are usually slower than SELECTs. Try to collect the data for the write operations and run them once, after the loop, as in the sketch below.
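
For example, here is a minimal sketch of that idea using the pgsql extension, assuming a connection in $conn, a data(custno, address) table as in your code, a unique constraint on custno, and PostgreSQL 9.5+ for ON CONFLICT:

    <?php
    // collect one VALUES tuple per spreadsheet row instead of querying inside the loop
    // (skip a header row first if your sheet has one)
    $values = [];
    foreach ($sheetData as $rowData) {
        $custno  = pg_escape_literal($conn, $rowData[0]);
        $address = pg_escape_literal($conn, $rowData[1]);
        $values[] = "($custno, $address)";
    }

    // one round trip: insert new customers, update existing ones
    if ($values) {
        $sql = "INSERT INTO data (custno, address) VALUES " . implode(', ', $values)
             . " ON CONFLICT (custno) DO UPDATE SET address = EXCLUDED.address";
        pg_query($conn, $sql);
    }

For very large sheets you may want to split $values into chunks of a few thousand rows per statement.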

For CSV operations I would prefer basic PHP functions like fgetcsv() and str_getcsv() over a separate library, as long as the file is not overcomplicated. If you are keen to check some alternatives to PhpSpreadsheet, take a look at Spout by box.com; it looks promising, but I have never used it.

I'm sure that you can improve your performance by using PHP generators; they are perfect every time you have to read a file's contents (see the sketch after the links). Here are some more links:

  1. https://www.sitepoint.com/memory-performance-boosts-with-generators-and-nikiciter/

  2. https://www.sitepoint.com/generators-in-php/

  3. https://riptutorial.com/php/example/5441/reading-a-large-file-with-a-generator/
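
To make that concrete, here is a minimal generator-based reader (also using fgetcsv() as suggested above); the file name and what you do with each row are just placeholders:

    <?php
    // yield one CSV row at a time so the whole file is never held in memory
    function readCsvRows(string $path): Generator
    {
        $handle = fopen($path, 'r');
        if ($handle === false) {
            throw new RuntimeException("Cannot open $path");
        }
        while (($row = fgetcsv($handle)) !== false) {
            yield $row;
        }
        fclose($handle);
    }

    foreach (readCsvRows('customers.csv') as $rowData) {
        // collect $rowData here for a batched INSERT/UPDATE as described above
    }
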

kkreft

If not using PHP for this operation is an option for you, try exporting the spreadsheet as CSV and importing the file using COPY. It won't take more than a few seconds.

If your database is installed locally, you just need to execute COPY in a client of your choice, e.g. pgAdmin. Check this answer for more information.

    COPY your_table FROM '/home/user/file.csv' DELIMITER ',' CSV HEADER;

Keep in mind that the user postgres in your system must have the necessary permissions to access the CSV file. Check how to do that in your operating system, e.g. chown in Linux.

In case your database is installed on a remote server, you have to use the STDIN facility of COPY via psql:

    $ cat file.csv | psql your_db -c "COPY your_table FROM STDIN DELIMITER ',' CSV HEADER;"
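
If you prefer to stay in PHP, the pgsql extension can also feed COPY; a rough sketch, where the connection string, file name, and table name are placeholders (note that pg_copy_from() expects plain delimiter-separated lines, so heavily quoted CSV fields would need extra handling):

    <?php
    // hypothetical connection parameters; adjust to your environment
    $conn = pg_connect('host=localhost dbname=your_db user=postgres');

    // read the CSV as raw lines and drop the header row
    $lines = file('file.csv', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    array_shift($lines);

    // stream all rows into the table in a single COPY
    pg_copy_from($conn, 'your_table', $lines, ',');
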
Jim Jones