
Running PHP 5.3.6 under MAMP on a Mac, memory usage increases every few calls (every 3 to 8) until the script dies from memory exhaustion. How do I fix this?

libxml_use_internal_errors(true);
while(true){
 $dom = new DOMDocument();
 $dom->loadHTML(file_get_contents('http://www.ebay.com/'));
 unset($dom);
 echo memory_get_peak_usage(true) . '<br>'; flush();
}

4 Answers


Calling libxml_use_internal_errors(true) suppresses error output, but libxml then stores every error in an internal buffer, and that buffer grows on each loop iteration. Either turn off internal error handling and suppress the PHP warnings some other way, or clear the buffer on each iteration, like this:

<?php
libxml_use_internal_errors(true);
while(true){
 $dom = new DOMDocument();
 $dom->loadHTML(file_get_contents('ebay.html'));
 unset($dom);
 // Toggling internal error handling off and back on discards the buffered errors.
 libxml_use_internal_errors(false);
 libxml_use_internal_errors(true);
 echo memory_get_peak_usage(true) . "\r\n"; flush();
}
?>
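
To see the buffer growing for yourself, here is a minimal sketch. It assumes a small, deliberately malformed HTML string in `$html` (made up for illustration, not the eBay page) and counts the buffered errors after each parse. The count should climb with every iteration and drop to zero once `libxml_clear_errors()` is called; the exact numbers depend on your libxml2 version.

```php
<?php
// Sketch only: $html is a made-up malformed document, not taken from the question.
$html = '<html><body><div id="a" id="b"><p>text</span></div></body></html>';

libxml_use_internal_errors(true);
libxml_clear_errors(); // start from an empty buffer

$counts = array();
for ($i = 0; $i < 3; $i++) {
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    unset($dom);
    // Errors from all earlier parses are still held in the buffer.
    $counts[] = count(libxml_get_errors());
}

libxml_clear_errors(); // explicit alternative to toggling the setting off and on
$afterClear = count(libxml_get_errors());

echo implode(', ', $counts), ' -> ', $afterClear, "\n";
```

Note that `unset($dom)` has no effect on these counts, since the error buffer belongs to libxml rather than to the DOMDocument instance.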
Tak
  • Thank you so much. I have searched high and low for this and never thought to look at the implications of using `libxml_use_internal_errors(true)` –  Dec 05 '11 at 01:06
  • I suspect this is also the answer to: http://stackoverflow.com/questions/8188729/domdocument-xpath-leaking-memory-during-long-command-line-process-any-way-to –  Dec 05 '11 at 01:10
  • Feel free to cross-post in the other answer and/or link here and spread the love. @FrancisAvila Thanks, well spotted :) – Tak Dec 05 '11 at 01:14
  • +100 if I could! Dude, you're a life saver. All this time I thought it was DOMDocument's fault and kept hacking around it. Time to refactor. =) – Alix Axel Apr 26 '12 at 18:37
  • Yeahhh!! I was stuck trying to figure out where my memory leak came from, and you pointed me in the right direction. Thank you very much! – Ciges Feb 24 '17 at 11:23

Based on @Tak's answer and @FrancisAvila's comment, I found that this snippet works better for me:

while (true)
{
    $dom = new DOMDocument();

    if (libxml_use_internal_errors(true) === true) // previous setting was true?
    {
        libxml_clear_errors();
    }

    $dom->loadHTML(file_get_contents('ebay.html'));
}

print_r(libxml_get_errors()); // errors from the last iteration are accessible

This has two added benefits: 1) the errors of the last parse are not discarded, so you can still read them via libxml_get_errors() if you ever need them, and 2) libxml_clear_errors() is called only when necessary, since libxml_use_internal_errors() returns the previous setting.
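
If you want the errors of each parse handed back to you instead of left in the buffer, one option is to wrap the pattern in a function. This is only a sketch: `load_html_with_errors()` is a hypothetical helper name, not something PHP or libxml provides, and it assumes you are happy to restore whatever error-handling setting the caller had before.

```php
<?php
// Hypothetical helper: parse one document and hand back only the libxml
// errors raised by that parse, leaving the internal buffer empty afterwards.
function load_html_with_errors($html, &$errors)
{
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();                 // drop anything left from earlier parses

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    $errors = libxml_get_errors();         // array of LibXMLError objects
    libxml_clear_errors();                 // keep the buffer from growing
    libxml_use_internal_errors($previous); // restore the caller's setting

    return $dom;
}

$errors = array();
$dom = load_html_with_errors('<html><body><p>Hello</p></body></html>', $errors);
echo $dom->getElementsByTagName('p')->item(0)->nodeValue, "\n";
```

The trade-off compared with the loop above is one extra libxml_clear_errors() call per parse, in exchange for never having to remember to clear the buffer at the call site.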

Alix Axel

You can try forcing the garbage collector to run with gc_collect_cycles(), but otherwise you're out of luck. PHP doesn't expose much of anything to control its internal memory usage, let alone memory used by a plugin library.

Marc B
  • Unfortunately, memory keeps being consumed even with `gc_enable()` at the start of the script, and `gc_collect_cycles()` at the end of each loop iteration returns 0. –  Dec 05 '11 at 00:48

Testing your script locally produces the same result. Pointing file_get_contents() at a local HTML file, however, produces consistent memory usage. It could be that the output from ebay.com changes every few calls.

Tak
  • Saving the eBay homepage to a local HTML file and using that instead yields the same memory bloat for me. The included code is a simplification of my problem code, which actually loads content from local caches. –  Dec 05 '11 at 00:56
  • Aha, found the solution. Will post it in a separate answer. – Tak Dec 05 '11 at 01:02