I have a problem using XMLReader and simplexml_import_dom. I'm trying to search four files of 2 MB each plus one 27 MB file. The problem is not memory but execution time (around 50 s). How can I optimize the code?

public function searchInFeed()
{
    $feed =& $this->getModel('feed','afiliereFeeduriModel');
    $myfeeds = $feed->selectFeed();

    set_time_limit(0);      // lift the execution time limit once, not per feed
    $z = microtime(true);   // start the timer before the loop so it covers every feed

    foreach ($myfeeds as $f)
    {
        $x = new XMLReader();
        $x->open($f->url);
        $doc = new DOMDocument('1.0', 'UTF-8');

        while ($x->read())
        {
            if ($x->nodeType === XMLReader::ELEMENT)
            {
                // Copy the current element and its whole subtree into a DOM node,
                // then wrap it in SimpleXML.
                $nod = simplexml_import_dom($doc->importNode($x->expand(), true));

                $data['text'] = 'Chicco termometru';
                $data['titlu'] = 'title';
                $data['nod'] = &$nod;
                if ($this->searchInXML($data))
                {
                    echo $nod->title."<br>";
                }
                $x->next();   // skip the subtree that was just expanded
            }
        }
        $x->close();
    }
    echo (microtime(true) - $z)."<br>";   // elapsed seconds
    echo memory_get_usage()/1024/1024;    // memory usage in MB
    die();
}
hakre
  • 3
    This seems similar to http://stackoverflow.com/questions/1167062/best-way-to-process-large-xml-in-php - basically, switching to something like [XML Reader](http://nl.php.net/manual/en/book.xmlreader.php) should give you more speed and less memory usage. – Jory Geerts Jan 26 '12 at 10:33
  • I'm using XMLReader, but it takes too much time to execute the code. – Bogdan Ungureanu Jan 26 '12 at 10:37
  • This `simplexml_import_dom($doc->importNode($x->expand()` thing is done for _each single_ element in the document; I doubt that's any faster than using the DOM/SimpleXML extension for the whole document, especially once you count the PHP script's overhead for the loop, method calls and return values. And it's rather unlikely to use less memory either, since the elements are kept in the DOMDocument anyway. | And we don't know what `searchInXML()` does. You'd have to rewrite this method to gain "anything" from XMLReader – VolkerK Jan 26 '12 at 10:46
  • 1
    Why are you using SimpleXML, XMLReader and DOM all in one script? – Gordon Jan 26 '12 at 10:51
  • 1
    Also, my naive assumption would be that the issue is network latency. Have you profiled your script yet? If not, please do so. If yes, please update your question and tell us which parts are slow. – Gordon Jan 26 '12 at 10:55
  • Measure each part of your function, not only the whole function. You can only tell what takes how long if you separate the measurements; then pick the part(s) that take longest and improve them. Unless you do that, the only answer is "improve the speed of your function, whatever causes the duration", which is not helpful. If you add more specific information to your question, others might be able to provide better answers. – hakre Jan 26 '12 at 11:28
  • Please provide a short example code (and associated XML) *that we can execute*, which demonstrates the problem. – salathe Jan 26 '12 at 11:49
  • If you just use XMLReader() and handle the events you need, this should be quite fast. Anyway, could you please explain what exactly you are trying to find in the document, and what it looks like? I don't really get the point of the DOM operations, and they are quite slow... – Max Jan 26 '12 at 12:31
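
What the comments suggest could look roughly like the sketch below: stream with XMLReader and expand only the elements that can actually contain a match, rather than every element starting at the root. This is only a sketch under assumptions, since the question shows neither the feed layout nor `searchInXML()`. The function name `searchFeedTitles`, the record element name `item` and its `title` child are all hypothetical, and a plain case-insensitive `stripos()` check stands in for `searchInXML()`.

```php
<?php
/**
 * Hypothetical helper (not from the question): stream a feed and collect the
 * titles that contain $needle. The element names "item" and "title" are
 * assumptions about the feed layout.
 */
function searchFeedTitles(string $uri, string $needle, string $itemName = 'item'): array
{
    $reader = new XMLReader();
    if (!$reader->open($uri)) {
        throw new RuntimeException("Cannot open feed: $uri");
    }

    $doc    = new DOMDocument('1.0', 'UTF-8');
    $titles = [];

    while ($reader->read()) {
        // Ignore everything except the opening tag of a record element.
        if ($reader->nodeType !== XMLReader::ELEMENT || $reader->name !== $itemName) {
            continue;
        }

        // Expand only this one record, not the whole document.
        $node = $reader->expand();
        if ($node === false) {
            continue;
        }
        $item  = simplexml_import_dom($doc->importNode($node, true));
        $title = (string) $item->title;

        // Stand-in for searchInXML(): case-insensitive substring match.
        if (stripos($title, $needle) !== false) {
            $titles[] = $title;
        }
    }

    $reader->close();
    return $titles;
}
```

To see where the time goes, each feed can be timed separately, e.g. `$t = microtime(true); $hits = searchFeedTitles($f->url, 'Chicco termometru'); echo microtime(true) - $t;`. If most of the time is spent on remote URLs, the download is the likely bottleneck rather than the parsing. The loop deliberately lets `read()` walk through the children of an expanded record instead of calling `next()` and then `read()`, because that combination can skip a record when two records are adjacent with no whitespace between them.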

0 Answers