I have a 52 GB XML file that I need to insert into a database, but I don't know its structure. I've been searching for how to iterate over it with XMLReader, but it seems I have to know the structure to do that. If I call next() more than once, it just goes to the end of the file. If I call it just once, it gives me the first node, which contains all the data, and I can't inspect it because of memory issues.

    $reader = new XMLReader();
    $reader->open('D:\_WORK\ESStatistikListeModtag.xml');
    $reader->read();
    $reader->next();
    var_dump($reader->expand());

This is what I tried. I also tried other XMLReader functions, with no success. How can I do this? Thanks for any help or advice.

Armenio
  • You will need to know the structure, or better the rules for the structure - so you can map them to relational (database) data. XMLReader+DOM will work, but you have to define and develop the mapping logic. – ThW Jun 03 '15 at 10:04
  • Does the XML have an `xsd` or `dtd` defined? If so, it can give you an idea of the XML structure. – web-nomad Jun 03 '15 at 14:23
  • Use one of the programs listed here to open your XML and see the structure - http://stackoverflow.com/questions/159521/text-editor-to-open-big-giant-huge-large-text-files – web-nomad Jun 03 '15 at 14:26

1 Answer

A file like that cannot be taken in all at once; you have to go through it in parts.

This simple code may help you to understand the structure. Set the path array to empty and look at the top level in the output, for example 'root'. Put that name into the path array and look at the next level, which in my example includes 'car'. If you do not like public transportation :), write 'car' into the array and look at the next level, and so on.

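    $xml = new XMLReader();
    $xml->open(FILENAME);                      // FILENAME: path to the XML file
    $path = array('root', 'car' /* ... */);    // element names from the root down; array() shows the top level

    $pp = array();           // names of the open elements, down to the watched level
    $selected = 0;           // 1 while inside the element that matches $path
    $l = 0;                  // current level
    $level = count($path);   // level to watch

    while ($xml->read()) {
        $start = ($xml->nodeType == XMLReader::ELEMENT);
        if ($start) {
            // Element start
            if ($l < $level) array_push($pp, $xml->name);
            // With an empty $path, report once at the root instead of at every element
            if (($l == ($level - 1) || (!$level && !$l)) && $path == $pp) { echo implode(', ', $pp)."<br>"; $selected = 1; }
            if (($l == $level) && $selected) echo "&nbsp;&nbsp;&nbsp;".$xml->name."<br>";
            $l++;
        }
        // A self-closing element (<a/>) produces no END_ELEMENT node, so close it here
        if ($xml->nodeType == XMLReader::END_ELEMENT || ($start && $xml->isEmptyElement)) {
            // Element end
            if ($selected && ($l == $level)) {
                $selected = 0;
                // you may call die() here if you do not want to wait for repeats of the path
            }
            $l--;
            if ($l < $level) array_pop($pp);
        }
    }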
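
If stepping down one level at a time is too slow on a file this size, the same idea can be turned into a single pass that counts every element path down to a chosen depth. The following is only a rough sketch, not tested against the asker's file: the function name summarizeXmlPaths and the parameters $maxDepth and $maxNodes are made up for illustration. The $maxNodes cap stops the loop early, so a run in the browser does not hang on the whole file.

```php
<?php
// Hypothetical helper: counts element paths (e.g. "root/car") up to $maxDepth levels,
// reading at most $maxNodes nodes so the whole file is never processed by accident.
function summarizeXmlPaths(string $uri, int $maxDepth = 3, int $maxNodes = 100000): array
{
    $reader = new XMLReader();
    if (!$reader->open($uri)) {
        throw new RuntimeException("Cannot open $uri");
    }

    $counts = array();   // path => number of times it was seen
    $stack  = array();   // names of the currently open elements
    $seen   = 0;         // nodes read so far

    while ($seen < $maxNodes && $reader->read()) {
        $seen++;
        if ($reader->nodeType == XMLReader::ELEMENT) {
            $stack[] = $reader->name;
            if (count($stack) <= $maxDepth) {
                $key = implode('/', $stack);
                $counts[$key] = ($counts[$key] ?? 0) + 1;
            }
            if ($reader->isEmptyElement) {
                array_pop($stack);   // <a/> has no END_ELEMENT node
            }
        } elseif ($reader->nodeType == XMLReader::END_ELEMENT) {
            array_pop($stack);
        }
    }

    $reader->close();
    return $counts;
}
```

Calling summarizeXmlPaths() with the file path and printing the returned array with print_r() gives a list of paths with their counts. A path with a large count is probably the repeating record to map to a database row. Raise $maxNodes step by step if the first part of the file does not show enough.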
splash58
  • Thanks. It didn't work very well, the names weren't showing, I made some changes and had to put a counter in the while loop because the browser was crashing and it finally worked. – Armenio Jun 04 '15 at 11:13
  • I'm glad you found a solution. Probably my XML example was not a suitable one; for me it worked OK. Maybe you could put your code on eval.in? I'm curious. – splash58 Jun 04 '15 at 11:18