
We downloaded a .osm file from the OpenStreetMap GIS data and converted it into a .csv file with osmconvert.exe. The CSV file is 3.5 GB in size. We tried importing it into the database through HeidiSQL, and also tried to import the file using the PHP script below:

$path = "../../indiacountry.csv";
$row = 0;
$data_entries = array();
if (($handle = fopen($path, "r")) !== FALSE) {
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        $row++;
        $data_entries[] = $data;
    }
    fclose($handle);
}

// this you'll have to expand
$ts = 0;
foreach ($data_entries as $line) {
    $ts++;
    if ($ts > 0) {
        mysql_query("insert into mbrace_resources.street_unit_number_india (id1)
                     values ('" . str_replace("'", "", $line[0]) . "')")
            or die("the error " . mysql_error());
    }
    // $db->execute($line);
}

When we first tried this script, we got a memory_limit error and a timeout. We changed memory_limit to 4000 MB and set the time limit to 0. When we tried the script again, the page stayed blank and kept executing endlessly, but not a single row was inserted into the table.
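Roughly, the overrides at the top of the script looked like this (note that PHP's shorthand byte notation expects a K/M/G suffix, so "4000M" rather than "4000MB"):

ini_set('memory_limit', '4000M'); // raise the memory limit for this script
set_time_limit(0);                // 0 = no execution time limit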

After going through all of this, we feel the only way forward is to split the CSV file into multiple files.

How shall we do it?

Thanks in advance

Husain
  • Did you try reading the CSV line by line rather than in one big BLOB? – RiggsFolly Aug 21 '18 at 14:11
  • Possible duplicate of [Batch file to split .csv file](https://stackoverflow.com/questions/20602869/batch-file-to-split-csv-file) – Ronnie Oosting Aug 21 '18 at 14:14
  • Show us your PHP code used to apply this data to the database. I would bet someone will give you a solution – RiggsFolly Aug 21 '18 at 14:16
  • @RiggsFolly We tried using fgetcsv to import the CSV line by line into the table, but the script just doesn't execute with this big file. – Husain Aug 21 '18 at 14:18
  • Well, you would have to run it from the PHP CLI and not via a browser. Did you do that? – RiggsFolly Aug 21 '18 at 14:19
  • Possibly related: https://stackoverflow.com/questions/20125242/import-openstreetmap-data-osm-file-to-a-mysql-database-sql-file – Progrock Aug 21 '18 at 14:21
  • No, I tried executing it in the browser. What's the PHP CLI, by the way? I have only heard of the Windows command line until now. – Husain Aug 21 '18 at 14:22
  • Which format are you porting from and to? – Progrock Aug 21 '18 at 14:22
  • Edited the question. – Husain Aug 21 '18 at 14:26
  • Building an array of all of the data prior to inserting it is a waste of time. Insert each row as you read it. Also `mysql_`?! – Nigel Ren Aug 21 '18 at 14:28
  • "but none of them seemed to go through. " isn't a description. What error messages/number did you get? Also, how long is your longest line and what is the file encoding. If the file encoding is different than your LC_LOCALE, it's going to break. Also make sure your delimiter isn't present in the file in inside fields. – Terry Carmen Aug 21 '18 at 14:30
  • You can slurp in a CSV via a MySQL import (see the sketch after these comments): https://stackoverflow.com/questions/3635166/how-to-import-csv-file-to-mysql-table – Progrock Aug 21 '18 at 14:37
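As the linked question suggests, MySQL can bulk-load the CSV directly, which sidesteps PHP's memory limits altogether. A minimal sketch against the table from the question (LOAD DATA LOCAL INFILE requires the local_infile option to be enabled on both client and server, and the line terminator here is an assumption about the file):

// Sketch: let MySQL bulk-load the file instead of inserting row by row from PHP.
$sql = "LOAD DATA LOCAL INFILE 'indiacountry.csv'
        INTO TABLE mbrace_resources.street_unit_number_india
        FIELDS TERMINATED BY ','
        LINES TERMINATED BY '\\n'
        (id1)";
mysql_query($sql) or die("import failed: " . mysql_error());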

2 Answers


If you are looking for a PHP-specific solution, here is a simple one you can tweak to your needs. This solution assumes that you do not need to duplicate the header row for each file; you can modify it accordingly to add a header row to every part file if needed:

$outputFile = 'indiacountry-part-';
$splitSize = 50000; // 50k records in one file
$in = fopen('indiacountry.csv', 'r');

$rows = 0;
$fileCount = 0; // incremented before each new part file, so numbering starts at 0001
$out = null;

while (!feof($in)) {
    // start a new part file every $splitSize rows
    if (($rows % $splitSize) == 0) {
        if ($rows > 0) {
            fclose($out);
        }

        $fileCount++;

        // for filenames like indiacountry-part-0001.csv, indiacountry-part-0002.csv etc.
        $fileCounterDisplay = sprintf("%04d", $fileCount);

        $fileName = "$outputFile$fileCounterDisplay.csv";
        $out = fopen($fileName, 'w');
    }

    $data = fgetcsv($in);

    // fgetcsv() returns FALSE on the trailing empty read at EOF; skip it
    if ($data) {
        fputcsv($out, $data);
    }

    $rows++;
}

fclose($out);

Now you can programmatically parse each part file 'indiacountry-part-xxxx.csv' and insert it into your table in batches. Read each line and insert it as you go, not as one big CLOB.
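For the insert step, here is a sketch of batched multi-row INSERTs over one part file, reusing the mysql_ calls from the question (the batch size of 500 is an arbitrary choice):

$batch = array();
if (($in = fopen('indiacountry-part-0001.csv', 'r')) !== FALSE) {
    while (($data = fgetcsv($in)) !== FALSE) {
        $batch[] = "('" . mysql_real_escape_string($data[0]) . "')";
        // flush every 500 rows as a single multi-row insert
        if (count($batch) == 500) {
            mysql_query("insert into mbrace_resources.street_unit_number_india (id1) values " . implode(',', $batch))
                or die("the error " . mysql_error());
            $batch = array();
        }
    }
    // flush the final partial batch
    if ($batch) {
        mysql_query("insert into mbrace_resources.street_unit_number_india (id1) values " . implode(',', $batch))
            or die("the error " . mysql_error());
    }
    fclose($in);
}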

combomatrix
  • Tried your script, but it loads endlessly. The reason is simple: the file is 3.5 GB. – Husain Aug 21 '18 at 14:44
  • I agree! Obviously this combination (PHP, large data set, linear extraction, etc.) is a slow and sluggish one. I gave this answer only because you are attempting to fix it in PHP. However, try other languages as well; the logic would largely remain the same. Unix shell scripts can handle this too. Also, look for SQL clients that have a CSV import feature and are well designed for such bulk insertions, e.g. HeidiSQL. Cheers! – combomatrix Aug 21 '18 at 15:03
  • I don't use phpMyAdmin, for your information. HeidiSQL has been my SQL client for years. Even that gives me a "Not Responding" error. – Husain Aug 21 '18 at 15:07

The script you show reads the WHOLE .csv file into an in-memory array. It's not surprising it won't run: that would require at least 3.5 GB+ of memory.

Instead, read one line at a time from the file and apply it directly to the database.

I am going to ignore the fact that you are using the old, dangerous and deprecated mysql_ database extension for now. If you tell me you have access to mysqli_ or PDO, I will willingly rewrite this for either of those APIs (a rough PDO sketch is included at the end of this answer).

$path = "../../indiacountry.csv";
$row = 0;
if (($handle = fopen($path, "r")) !== FALSE) {
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        $row++;
        // strip single quotes so they cannot break the SQL string below
        $id = str_replace("'", "", $data[0]);
        mysql_query("insert into mbrace_resources.street_unit_number_india
                    (id1) values ('$id')")
            or die("the error " . mysql_error());
    }
    fclose($handle);
}

echo "Finished: Added $row rows";
RiggsFolly
  • Hey, thanks RiggsFolly. The script worked. Regarding mysql_, we know it is deprecated, but the problem is that we have coded our entire application with mysql_ over half a decade. Converting it to mysqli or PDO is going to take a bit of work; we shall surely convert the scripts to mysqli soon. Anyway, thanks for the concern. – Husain Aug 23 '18 at 03:11