So, I have a database with 435,453 rows of data, and 271 different companies that each need specific information from it. I am currently aggregating it by selecting all the data with a single MySQL query, which returns all rows in 1.28 seconds. Fine so far.
Then I want to traverse the rows and write them to per-company files using fputcsv(). So I start by calling fopen() 271 times to create the files, and then, for each returned row, I check which companies should have data from that row and write to their files accordingly (a simplified sketch of that setup is below).
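In rough outline, the setup looks like this (a sketch only; $xsps, $text and $csv are the variable names used in the real code further down, while the output paths here are made up):

$text = array();
$csv  = array();
# One TXT and one CSV handle per company; $xsps maps company => list of sites
foreach ($xsps as $xsp => $sites) {
    $text[$xsp] = fopen("export/$xsp.txt", "w");
    $csv[$xsp]  = fopen("export/$xsp.csv", "w");
}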
Now, using memory_get_usage() I can see that memory usage sits at a steady 6 MB during the entire process, and using microtime() I am timing each iteration of the while() loop that reads from the DB.
And the end result is that each iteration takes about 0.00001 seconds, so for each row in the DB it apparently takes 0.00001 seconds to determine which companies should have that row and to write it to their files.
But something is messed up, because after 100 minutes the process is still not done, and the while() loop advances by about 100 rows every second. If my math is correct, traversing 435,453 rows at 0.00001 seconds each should take about 4 seconds (435,453 × 0.00001 s ≈ 4.35 s), not well over an hour.
Here is the code, or the part that is taking so long:
$q = mysql_query(bq($query)) or print mysql_error();
while ($r = mysql_fetch_assoc($q)) {
    $nr++;
    $now = microtime(true);
    # Print a progress line every 100 rows when run in debug mode
    if (($nr % 100) == 0 && $argv[1] == "debug") {
        print "$nr: " . fsize(memory_get_usage()) . ", time: " . sprintf("%.5f", ($now - $savetime)) . "\n";
    }
    foreach ($xsps as $xsp => $sites) {
        # Skip companies whose sites don't include this row's "citynet"
        if (!in_array($r["citynet"], $sites)) continue;
        $data   = format_fields($r, $xsp);
        $values = array_values($data);
        $keys   = array_keys($data);
        $linefeed = "\n";
        # TXT (tab-separated); write the header line on the first row
        if ($nr == 1) fwrite($text[$xsp], join("\t", $keys) . $linefeed);
        fputcsv($text[$xsp], $values, "\t");
        # CSV (semicolon-separated); same header logic
        if ($nr == 1) fwrite($csv[$xsp], join(";", $keys) . $linefeed);
        fputcsv($csv[$xsp], $values, ";");
    }
    $savetime = microtime(true);
}
And the output, which prints every 100 rows, looks something like this:
12600: 6 Mb, time: 0.00000
12700: 6 Mb, time: 0.00000
12800: 6 Mb, time: 0.00000
12900: 6 Mb, time: 0.00001
13000: 6 Mb, time: 0.00000
13100: 6 Mb, time: 0.00000
13200: 6 Mb, time: 0.00000
So, obviously - what am I doing wrong? How can it take so long when the numbers say it shouldn't?
EDIT So, obviously, I was measuring the wrong interval: $savetime is set at the very end of an iteration and $now at the very start of the next, so the difference only covers the gap between iterations and none of the actual per-row work. I changed the measurement to this:
while ($r = mysql_fetch_assoc($q)) {
    $nr++;
    $start = microtime(true);
    if (($nr % 100) == 0 && $argv[1] == "debug") {
        # $processtime holds the duration of the previous full iteration
        print "$nr: " . fsize(memory_get_usage()) . ", time: " . sprintf("%.5f", $processtime) . "\n";
    }
    ...
    $stop = microtime(true);
    $processtime = ($stop - $start);
}
And now it reports that each row takes about 0.015 seconds to complete, which over 435,453 rows works out to roughly 108 minutes for the entire process (435,453 × 0.015 s ≈ 6,530 s). So the actual question now is: why is it so slow? Is it fputcsv() that is slow, or is it PHP itself?
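To narrow it down, I could time fputcsv() in isolation with a rough micro-benchmark along these lines (a sketch only; the sample row, column count and file path are made up):

# Time fputcsv() alone over the same number of rows
$fh  = fopen("/tmp/bench.csv", "w");
$row = array("one", "two", "three", "four", "five");
$start = microtime(true);
for ($i = 0; $i < 435453; $i++) {
    fputcsv($fh, $row, ";");
}
printf("fputcsv: %.2f seconds for 435453 rows\n", microtime(true) - $start);
fclose($fh);

If that finishes in a few seconds, the suspect would shift to the rest of the loop body, for example the in_array() scan over all 271 companies' site lists that runs for every single row.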