4

The following code runs in a loop, and each iteration requests a new URL. My problem is that each pass takes up more and more memory.

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://site.ru/');
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, 'http://site.ru/');
curl_setopt($ch, CURLOPT_HEADER, false);

$html = new \DOMDocument();
$html->loadHTML(curl_exec($ch));

curl_close($ch);
$ch = null;

$xpath = new \DOMXPath($html);
$html = null;

foreach ($xpath->query('//*[@id="tree"]/li[position() > 5]') as $category) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $xpath->query('./a', $category)->item(0)->nodeValue);
    curl_setopt($ch, CURLOPT_TIMEOUT, 60);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, 'http://site.ru/');
    curl_setopt($ch, CURLOPT_HEADER, false);

    $html = new \DOMDocument();
    $html->loadHTML(curl_exec($ch));

    curl_close($ch);
    $ch = null;

    // etc.
}

Memory usage reaches 2000 MB. Script execution time is about 2 hours. The PHP version is 5.4.4. How can I avoid this memory leak? Thanks!

o_flyer

2 Answers

4

Stories from the internet indicate that curl_setopt($ch, CURLOPT_RETURNTRANSFER, true) is broken in some PHP/cURL versions.

You can also find similar stories about DOM.

Create a minimal test case that isolates the cause of the leak, i.e. remove one of the two extensions (DOM or cURL) from the code and check whether memory still grows.

Then reproduce it with the latest PHP version. If it still leaks, file a bug report; otherwise, use that PHP version.
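
A minimal sketch of such a test case, exercising only the DOM part, might look like the following. The helper name measureMemoryGrowth, the sample HTML, and the iteration count are assumptions made for illustration; they are not part of the original code. Note that memory_get_usage() reports memory handled by PHP's own allocator, so a leak inside a native library may only be visible with OS-level tools.

```php
<?php
// Hypothetical helper: runs $step repeatedly and returns how many bytes
// PHP-managed memory grew by over the whole run.
function measureMemoryGrowth(callable $step, $iterations)
{
    $step();               // warm-up pass so one-time allocations are not counted
    gc_collect_cycles();
    $before = memory_get_usage();

    for ($i = 0; $i < $iterations; $i++) {
        $step();
    }

    gc_collect_cycles();
    return memory_get_usage() - $before;
}

// Assumed stand-in for a downloaded page, so that cURL is not involved at all.
$sampleHtml = '<html><body><ul id="tree"><li><a href="/a">A</a></li></ul></body></html>';

// DOM-only step: parse the document and run an XPath query, then let both
// objects go out of scope.
$domOnly = function () use ($sampleHtml) {
    $doc = new DOMDocument();
    $doc->loadHTML($sampleHtml);
    $xpath = new DOMXPath($doc);
    $xpath->query('//*[@id="tree"]/li');
};

$growth = measureMemoryGrowth($domOnly, 1000);
printf("DOM only: %d bytes after 1000 iterations\n", $growth);
```

If the reported growth stays near zero here but the full script still leaks, the same helper can be pointed at a cURL-only step to test the other extension.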

Markus Malkusch
3

Reuse the same curl handle instead of creating and destroying it each time in your loop.

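$ch = curl_init();
// Options that stay the same for every request are set once.
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// CURLOPT_AUTOREFERER is a boolean switch, not a URL.
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_HEADER, false);
// $pages stands for the list of URLs to fetch.
foreach ($pages as $url) {
    // Only the URL changes between requests.
    curl_setopt($ch, CURLOPT_URL, $url);
    $response = curl_exec($ch);
}
// Close the handle once, after the loop.
curl_close($ch);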