4

I have a slight problem with an application I am working on. The application is a developer tool that dumps tables from a MySQL database to a JSON file, which the devs then grab using the Unix curl command. So far the tables we've been using have been relatively small (2 GB or less), but we've recently moved into another stage of testing that uses fully populated tables (40 GB+), and my simple PHP script breaks. Here's my script:

[<?php 

$database = $_GET['db'];

ini_set('display_errors', 'On');
error_reporting(E_ALL);

# Connect
mysql_connect('localhost', 'root', 'root') or die('Could not connect: ' . mysql_error());

# Choose a database
mysql_select_db('user_recording') or die('Could not select database');

# Perform database query
$query = "SELECT * from `".$database."`";
$result = mysql_query($query) or die('Query failed: ' . mysql_error());

while ($row = mysql_fetch_object($result)) {
   echo json_encode($row);
   echo ",";
}

?>] 

My question to you is: what can I do to make this script handle larger database dumps better?

Chris Maness
  • 1,682
  • 3
  • 22
  • 40
  • Is the script simply timing out? If so, you might look into what your default script time limit is (max_execution_time in php.ini), or just test using `set_time_limit(0)`. – user428517 Oct 19 '12 at 15:03
  • If you call your PHP script via a web browser, there is a timeout after 30 seconds. Is that the reason it breaks? If so, call it from a console instead. But why aren't you calling the [mysqldump](http://dev.mysql.com/doc/refman/5.5/en/mysqldump.html) command? – PKeidel Oct 19 '12 at 15:05
  • It won't solve your problem, but I suggest renaming `$database` to `$table` (it will be more logical). – Jocelyn Oct 19 '12 at 15:08
  • "...and my simple PHP script breaks..." - what is the error? – Aleksandar Vucetic Oct 19 '12 at 15:11
  • I believe it just times out; there's no error given. Thanks for noticing that, Jocelyn, this was thrown together very quickly. – Chris Maness Oct 19 '12 at 15:14
  • There probably is an error message saved in the webserver error log. – Jocelyn Oct 19 '12 at 15:15
  • echo json_encode($row); - so, are you running this PHP script on the command line and redirecting to a file, or are you calling the script directly through Apache? If the latter, then you are hitting Apache's request timeout. – Aleksandar Vucetic Oct 19 '12 at 15:18
  • If it's just a timeout, did you try putting set_time_limit(0) at the top of your script? And by the way, what are you doing by echoing all the database content? Are you trying to write it to a file, or what? – deej Oct 19 '12 at 15:30
  • Looks like this is something dumped in the middle of a JavaScript code block. He's putting it all into an array (hence the square brackets around the PHP tags). – user428517 Oct 19 '12 at 17:11

3 Answers

3

This is what I think the problem is:

You are using mysql_query. mysql_query buffers the entire result set in memory, and mysql_fetch_object then just reads rows from that buffer. For very large tables, you just don't have enough memory (most likely you are pulling all 40 GB of rows into that one single call).

Use mysql_unbuffered_query instead. There is more info on the MySQL Performance Blog, where you can find some other possible causes of this behavior.
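
A minimal sketch of what that change looks like, reusing the question's connection setup (the table name `mytable` is just a placeholder):

<?php

mysql_connect('localhost', 'root', 'root') or die('Could not connect: ' . mysql_error());
mysql_select_db('user_recording') or die('Could not select database');

# mysql_unbuffered_query streams rows from the server as you fetch them,
# instead of buffering the whole result set in PHP's memory first
$result = mysql_unbuffered_query('SELECT * FROM `mytable`')
    or die('Query failed: ' . mysql_error());

while ($row = mysql_fetch_object($result)) {
    echo json_encode($row);
    echo ",";
}

# caveat: with an unbuffered result set you must fetch every row before
# you can send another query on the same connection
?>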

Aleksandar Vucetic
  • 14,715
  • 9
  • 53
  • 56
1

I'd say just let MySQL do it for you, not PHP:

SELECT
  CONCAT("[",
       GROUP_CONCAT(
            CONCAT("{field_a:'", field_a, "'",
                   ",field_b:'", field_b, "'}")
       ),
  "]")
AS json FROM `table`;

It should generate something like this:

[
    {field_a:'aaa',field_b:'bbb'},
    {field_a:'AAA',field_b:'BBB'}
]
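
Two caveats if you go this route: the whole dump comes back as a single row with one json column, and GROUP_CONCAT output is silently truncated at group_concat_max_len (which defaults to only 1024 bytes), so you would have to raise that setting enormously for tables of this size. The PHP side then shrinks to something like this sketch ($json_query holding the SELECT above):

$result = mysql_query($json_query) or die('Query failed: ' . mysql_error());
$row = mysql_fetch_object($result);
echo $row->json;  # the entire dump as one string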
Mahdi
  • 9,247
  • 9
  • 53
  • 74
0

You might have a problem with MySQL buffering. But you might also have other problems. If your script is timing out, try disabling the timeout with set_time_limit(0). That's a simple fix, so if it doesn't work, you could also try:

  1. Try dumping your database offline, then transfer it via script or just direct HTTP. You might try making a first PHP script call a shell script which calls a PHP-CLI script that dumps your database to text. Then just pull the dump via HTTP.
  2. Try having your script dump only part of the database at a time (rows 0 through N, N+1 through 2N, and so on); there's a sketch of this after the list.
  3. Are you using compression on your HTTP connections? If your lag is transfer time (not script processing time), then speeding up the transfer via compression might help. If it's the data transfer itself, JSON might not be the best way to transfer the data. Maybe it is. I don't know. This question might help you: Preferred method to store PHP arrays (json_encode vs serialize)
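
For option 2, here is a rough sketch, assuming the table has an auto-increment primary key named id (the table name and chunk size are placeholders). Paginating on the key avoids the slowdown that large OFFSET values cause on big tables:

<?php

mysql_connect('localhost', 'root', 'root') or die('Could not connect: ' . mysql_error());
mysql_select_db('user_recording') or die('Could not select database');

$chunkSize = 10000;  # rows per query, tune to taste
$lastId = 0;

do {
    # fetch the next chunk of rows after the last id we saw
    $query = "SELECT * FROM `mytable` WHERE id > $lastId ORDER BY id LIMIT $chunkSize";
    $result = mysql_query($query) or die('Query failed: ' . mysql_error());
    $count = mysql_num_rows($result);

    while ($row = mysql_fetch_object($result)) {
        echo json_encode($row);
        echo ",";
        $lastId = $row->id;
    }
    mysql_free_result($result);
} while ($count == $chunkSize);

?>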

Also, for options 1 and 3, you might try looking at this question:

What is the best way to handle this: large download via PHP + slow connection from client = script timeout before file is completely downloaded

Matt
  • 556
  • 1
  • 4
  • 18