
I am trying to speed up the performance of this PHP code. I am running it through the browser on XAMPP, using Apache and MySQL.

My problem is that I am reading a very large 2 GB JSON file. I cannot read it in one go because I do not have enough memory, so I have to read it line by line.

The code reads the JSON line by line. For each line (chunk of JSON data) a function (`parse_item`) is called, which decodes the line of JSON, opens the MySQL database, inserts selected data from the decoded JSON, and then closes the database. I know it is the inserting into the database that is causing my performance problems, but I am not sure what to do (when I remove the database calls and just echo the decoded JSON, it is much faster).

The code (an abbreviated slice) works but is very slow.

Can anybody help? I am new to PHP so any explanation in simple terms would be gratefully received.

Also, if I ran this from the command line (i.e. installed PHP on my PC), would it run much quicker?

Thanks John

//Parse Function
function parse_item($string) {

$data = json_decode($string);


$uid=($data->JsonScheduleV1->CIF_train_uid);
$stp=($data->JsonScheduleV1->CIF_stp_indicator);                                                      
$head=($data->JsonScheduleV1->schedule_segment->signalling_id);       
$startlocation=($data->JsonScheduleV1->schedule_segment->schedule_location[0]->tiploc_code);

require('connect_db.php');
mysqli_select_db($mysql_link,"timetable");
mysqli_query($mysql_link,"INSERT INTO railstp (uid,stp,head,startlocation)
  VALUES ('$uid','$stp','$head','$startlocation')");
mysqli_close($mysql_link);

if (is_null($data)) {
   die("Json decoding failed with error: ". json_last_error());
}

}


$file = 'CIF_PA_TOC_FULL_DAILY_toc-full';

// Loop that slices up the JSON into lines omitted; for each line it calls:
parse_item($string);




EDIT

I have an InnoDB database. There are 20 keys - 2 are integers, 2 are dates and the rest text. Not sure how many lines, but I guess over 1,000 - I have not been able to run it long enough to find out.

user2635961
  • You should show the structure of your table. (If it is MyISAM or InnoDB, what keys, ...). How many lines do you have? With one invocation of the script you parse the whole file, calling `parse_item` for every line? – t.niese Sep 08 '13 at 17:17
  • Try using batch insert, maybe 100 - 1000 records at a time? http://stackoverflow.com/questions/5526917/how-to-do-a-batch-insert-in-mysql – SamV Sep 08 '13 at 17:29
  • don't open/close your DB connection for each line. Compile a mysql statement and re-use it. – Reactgular Sep 08 '13 at 17:29
  • Mathew - is there anywhere I can learn how to compile a statement and re use it. I get an error when I have tried to open and close the DB outside of the function shell. Thanks – user2635961 Sep 08 '13 at 17:38
  • Hi Sam. Thanks. I will read the link – user2635961 Sep 08 '13 at 17:39
  • Executing long-running tasks like this in the context of a web request is usually a bad idea - most web servers have a built-in limitation on execution time for web pages, after which they will kill the request, leaving you with only part of your file imported. – Neville Kuyt Sep 08 '13 at 21:35
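SamV's batch-insert suggestion above can be sketched like this: buffer rows in an array and flush them as one multi-row `INSERT` every few hundred lines, instead of one query per line. The helper `build_batch_sql()` is a made-up name for this sketch; the table and column names are taken from the question.

```php
<?php
// Hypothetical helper: turn an array of [uid, stp, head, startlocation]
// rows into one multi-row INSERT statement for the railstp table.
function build_batch_sql(array $rows): string {
    $tuples = array_map(function (array $r): string {
        // NOTE: real code must escape values first (mysqli_real_escape_string),
        // or better, use prepared statements.
        return "('" . implode("','", $r) . "')";
    }, $rows);
    return "INSERT INTO railstp (uid,stp,head,startlocation) VALUES "
         . implode(',', $tuples);
}

// Sketch of use inside the read loop (assumes $mysql_link and $batch exist):
//   $batch[] = [$uid, $stp, $head, $startlocation];
//   if (count($batch) >= 500) {              // 100-1000 rows per INSERT is typical
//       mysqli_query($mysql_link, build_batch_sql($batch));
//       $batch = [];                          // reset the buffer
//   }
```

MySQL handles one statement inserting 500 rows far faster than 500 single-row statements, because the per-query parsing and commit overhead is paid once.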

1 Answer


Creating and closing a MySQL connection on every parse_item() call wastes your script's time, and your if statement checks $data for null only after it has already been used, so I'd rewrite your code:

require('connect_db.php');
mysqli_select_db($mysql_link,"timetable");
function parse_item($string,&$mysql_link) {
  $data = json_decode($string);
  if (is_null($data)) {
     mysqli_close($mysql_link);die("Json decoding failed with error: ". json_last_error());
  }
  $uid=($data->JsonScheduleV1->CIF_train_uid);
  $stp=($data->JsonScheduleV1->CIF_stp_indicator);                                                      
  $head=($data->JsonScheduleV1->schedule_segment->signalling_id);       
  $startlocation=($data->JsonScheduleV1->schedule_segment->schedule_location[0]->tiploc_code);
  // NOTE: these values are not escaped; for untrusted input use
  // mysqli_real_escape_string() or a prepared statement.
  mysqli_query($mysql_link,"INSERT INTO railstp (uid,stp,head,startlocation)
                            VALUES ('$uid','$stp','$head','$startlocation')");
}


$file = 'CIF_PA_TOC_FULL_DAILY_toc-full';
$handle = fopen("c:\\folder\\$file" , "r");  // \\$file, so $file still interpolates ("\$" would be a literal dollar sign)
while(!feof($handle)){
   $string=fgets($handle); 
   parse_item($string,$mysql_link);       //REMEMBER: to pass your $mysql_link variable!
}
fclose($handle);
mysqli_close($mysql_link);

I hope this helps you.
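The "compile a statement and re-use it" idea from the comments can be sketched with a prepared statement: prepare the `INSERT` once, bind the variables once, and just execute per line. This is a sketch under the question's assumptions (`connect_db.php` defines `$mysql_link`, and `$file` holds the path); `mysqli_stmt_bind_param` binds by reference, so reassigning the variables before each execute is enough.

```php
<?php
require('connect_db.php');               // assumed to define $mysql_link
mysqli_select_db($mysql_link, "timetable");

// Prepare once: placeholders (?) are filled in per execute, and MySQL
// also escapes the values for us.
$stmt = mysqli_prepare($mysql_link,
    "INSERT INTO railstp (uid,stp,head,startlocation) VALUES (?,?,?,?)");
mysqli_stmt_bind_param($stmt, "ssss", $uid, $stp, $head, $startlocation);

$handle = fopen($file, "r");
while (!feof($handle)) {
    $data = json_decode(fgets($handle));
    if ($data === null) {
        continue;                        // skip blank or malformed lines
    }
    $uid           = $data->JsonScheduleV1->CIF_train_uid;
    $stp           = $data->JsonScheduleV1->CIF_stp_indicator;
    $head          = $data->JsonScheduleV1->schedule_segment->signalling_id;
    $startlocation = $data->JsonScheduleV1->schedule_segment->schedule_location[0]->tiploc_code;
    mysqli_stmt_execute($stmt);          // re-uses the already-parsed statement
}
fclose($handle);
mysqli_stmt_close($stmt);
mysqli_close($mysql_link);
```

This avoids re-parsing the SQL on every line and also removes the injection risk of interpolating values into the query string.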

01e
  • Hi Saber. That's brilliant. Big increase in speed. The key learning was you showing me how to write the code to ensure I did not have to keep opening and closing the database. I have a slightly different way of reading the file (line by line) before passing it to the function. Your answer perfectly helped. Thank you – user2635961 Sep 08 '13 at 18:51
  • @user2635961 yw ;), I used file handle to show you how you can use `parse_item()`. _don't forget to accept and voteup proper and your suitable answer._ Good luck – 01e Sep 09 '13 at 06:37
  • Thanks Saber. Tried to vote up but looks like I need a reputation of 15 before doing so. Will do that when I get to 15. – user2635961 Sep 09 '13 at 20:42