4

I need to parse a very large XML file, about 750 MB in size.

My memory limit is set to 512M:

ini_set('memory_limit', '512M');

I have no problem opening files under 30 MB, but with the 750 MB file I get a fatal error:

Fatal error: Allowed memory size of 1677721600 bytes exhausted (tried to allocate 2988843769 bytes)

This is how I open the file:

$fichier = file_get_contents($inputfileName);
$xmlInput = simplexml_load_string(utf8_encode($fichier));

Do you have any idea how to open this file?

Haim Evgi
bahamut100
  • Increase the memory limit again?... If that's not an option, what are you doing with the contents of the file? That info is probably needed in order to give any further advice. – Pekka Sep 07 '11 at 07:49
  • I don't see why an XML reader would need to allocate four times the size of the file. Can't you `mmap` it one way or another, split the file (according to its structure) and process the pieces one at a time with your favorite XML reader? – Alexandre C. Sep 07 '11 at 07:51
  • Using XMLReader seems to fix the problem – bahamut100 Sep 07 '11 at 08:01
  • possible duplicate of [Parse big XML in PHP](http://stackoverflow.com/questions/659369/parse-big-xml-in-php) – Gordon Sep 07 '11 at 08:05
  • For starters, **don't use `_string`**; **use `simplexml_load_file`** instead. That puts the memory on the system rather than under PHP's memory limit. However, if even the OS RAM limit would be an issue, use cursor-like traversal. – jave.web Jan 08 '23 at 21:59

3 Answers

6

The DOM-based extensions take up significantly more memory than the raw XML, because the whole document is parsed into a tree structure of nodes. Have a look at XMLReader instead:

The XMLReader extension is an XML Pull parser. The reader acts as a cursor going forward on the document stream and stopping at each node on the way.

Also make sure you parse with LIBXML_PARSEHUGE.
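
A minimal sketch of that approach, assuming the file is essentially a flat list of repeated elements. The helper name `streamElements` and the element name `item` are hypothetical and not taken from the question:

```php
<?php
// Hypothetical helper: yields each matching element of a large XML file
// as a small SimpleXMLElement, without loading the whole document.
function streamElements(string $path, string $elementName): Generator
{
    $reader = new XMLReader();
    // LIBXML_PARSEHUGE relaxes libxml2's built-in parser limits.
    if (!$reader->open($path, null, LIBXML_PARSEHUGE)) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        // Move the cursor forward to the first matching element.
        while ($reader->read() && $reader->name !== $elementName) {
            // keep reading
        }
        while ($reader->name === $elementName) {
            // Only the current element is expanded into a tree.
            yield new SimpleXMLElement($reader->readOuterXml());
            // Skip the rest of this element and go to the next one with the same name.
            $reader->next($elementName);
        }
    } finally {
        $reader->close();
    }
}
```

Usage could then look like `foreach (streamElements($inputfileName, 'item') as $item) { ... }`, so only one element at a time is held in memory as a SimpleXML object.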

An alternative would be the event-based XML Parser extension.
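
As a rough illustration of the event-based style, the sketch below feeds the file to the parser in small chunks and simply counts start tags. The helper name `countElements` and the chunk size are assumptions made for the example:

```php
<?php
// Hypothetical helper: counts start tags per element name while
// reading the file chunk by chunk, so the file is never fully in memory.
function countElements(string $path, int $chunkSize = 8192): array
{
    $counts = [];
    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler(
        $parser,
        // Called for every opening tag.
        function ($parser, string $name, array $attrs) use (&$counts): void {
            $counts[$name] = ($counts[$name] ?? 0) + 1;
        },
        // Called for every closing tag; nothing to do when only counting.
        function ($parser, string $name): void {
        }
    );

    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (!feof($handle)) {
            $chunk = fread($handle, $chunkSize);
            // The third argument tells the parser whether this is the final chunk.
            if (!xml_parse($parser, $chunk, feof($handle))) {
                throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
            }
        }
    } finally {
        fclose($handle);
    }
    return $counts;
}
```

Real processing would go into the two handlers. Unlike XMLReader, the parser pushes events to your callbacks, so you have to track the current position in the document yourself.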

Gordon
  • 312,688
  • 75
  • 539
  • 559
0

For big files, the XMLReader class is the ideal choice. But if you prefer a SimpleXML-style interface:

Code: https://github.com/dkrnl/SimpleXMLReader/blob/master/library/SimpleXMLReader.php

Usage example: http://github.com/dkrnl/SimpleXMLReader/blob/master/examples/example1.php

0

You want a SAX or other event-based XML parser. Google 'php sax parser'.

phs