
I have two XML files: one from a client and one created from a db query. The db XML file has this structure:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <tags>
        <title>Wordsleuth (2006, volume 3, 4): The Dictionary: Disapproving Schoolmarm or Accurate Record?</title>
        <alias>favart/wordsleuth-2006-volume-3-4-the-dictionary-disapproving-schoolmarm-or-accurate-record</alias>
        <id>4361</id>
    </tags>
</metadata>

The client XML has this structure:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <tags>
        <title>Wordsleuth (2006, vol. 3, 4): The Dictionary: Disapproving Schoolmarm or Accurate Record? – Search by Title – Favourite Articles – TERMIUM Plus® – Translation Bureau</title>
        <description>A Language Update article on the role that the dictionary plays in language usage.</description>
        <keywords>language usage; dictionaries</keywords>
        <subject>English language; Terminology</subject>
    </tags>
</metadata>

Each file has approximately 200 'tags' elements. After getting some hints from here and here and consulting the PHP manual, my first attempt produced this:

$client = 'C:\xampp\htdocs\wetkit\sites\all\modules\my_metatags\favart.xml';
$db = 'C:\xampp\htdocs\wetkit\sites\all\modules\my_metatags\tmp\from db\favart_db.xml';
$c_xmlstr = file_get_contents($client);
$d_xmlstr = file_get_contents($db);
$favartdoc_db = new DomDocument('1.0','UTF-8');
$favartdoc_cl = new DomDocument('1.0','UTF-8');
$favartdoc_db->loadXML($d_xmlstr);
$favartdoc_cl->loadXML($c_xmlstr);
for ($i=0;$i==$favartdoc_cl->getElementsByTagName('title')->count; $i++){
    $c_nodes = $x_favartdoc_cl->query('/metadata/tags/title');
    $c_node = $c_nodes->item($i);
for ($j=0; $j==$favartdoc_db->getElementsByTagName('title')->count; $j++){
        $d_nodes = $x_favartdoc_db->query('/metadata/tags/title');
        $d_node = $d_nodes->item($j);
        if(stripos(trim($c_node->nodeValue), trim($d_node->nodeValue))===0){
        $favartdoc_cl->replaceChild($d_node,$c_node);
        if($i==($c_nodes->count)){break;};
        }
}
$favartdoc_cl->saveXML();
}

This code runs, generates no errors, and does nothing. An echo statement at the end

echo "\n\n" . "THE TOTAL NUMBER OF MATCHES EQUALS " . $i . " IN " . $j . " NODES." . "\n";

generates this message:

THE TOTAL NUMBER OF MATCHES EQUALS 1 IN 1 NODES.

A second simpler approach produced this:

$favartdoc_db = new DomDocument('1.0','UTF-8');
$favartdoc_cl = new DomDocument('1.0','UTF-8');
$favartdoc_db->load($db);
$favartdoc_cl->load($client);
$favartdoc_cl->formatOutput = true;
$c_meta_x =  new DOMXpath($favartdoc_cl);
$d_meta_x =  new DOMXpath($favartdoc_db);
foreach ($c_meta_x->query('//tags') as $c_tag){
        foreach ($d_meta_x->query('//tags') as $d_tag){
            if(strncasecmp(trim($c_tag->title), trim($d_tag->title) , strlen(trim($d_tag->title)))===0){
                $c_tag->appendChild($d_tag);
            }           
        }
}
$favartdoc_cl->saveXML();

But this generates an error:

exception 'DOMException' with message 'Wrong Document Error'

The suggested fix, calling importNode before attaching the node to the DOM, still generates the same error.
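For reference, importNode does not move or convert the node it is given. It returns a copy that belongs to the target document, and that returned copy is what has to be attached. If the original node is appended after the call, the same 'Wrong Document Error' would be expected. A minimal sketch of the pattern, using made-up inline XML in place of the two files (the variable names here are illustrative, not from my script):

```php
<?php
// Made-up inline documents standing in for the db and client files.
$db = new DOMDocument();
$db->loadXML('<metadata><tags><title>A</title><id>4361</id></tags></metadata>');
$cl = new DOMDocument();
$cl->loadXML('<metadata><tags><title>A - Site</title></tags></metadata>');

$d_tag = $db->getElementsByTagName('tags')->item(0);
$c_tag = $cl->getElementsByTagName('tags')->item(0);

// importNode() returns a copy owned by $cl; $d_tag itself still belongs to $db.
$copy = $cl->importNode($d_tag, true); // true = deep copy, children included

// Append the returned copy. Appending $d_tag here would throw the DOMException.
$c_tag->appendChild($copy);

echo $cl->saveXML();
```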

As you can see, I'm trying a different string-matching function in each attempt. Ultimately, I want either to replace the titles in the client XML with those from the db, or to append the whole tag set from the db XML to the client XML and then delete the client title element.

Any help would be appreciated.

devnull

1 Answer


This is what worked for me.

$client  = 'some\where\somefile.xml';
$db = 'some\where\someOtherfile.xml';

$c_xmlstr = file_get_contents($client);
$d_xmlstr = file_get_contents($db);

$doc_db = new DomDocument('1.0','UTF-8');
$doc_cl = new DomDocument('1.0','UTF-8');


$doc_db->loadXML($d_xmlstr);
$doc_cl->loadXML($c_xmlstr);

$x_doc_db = new DOMXpath($doc_db);
$x_doc_cl = new DOMXpath($doc_cl);

$c_nodes = $x_doc_cl->query('/metadata/tags');
$c_nodes_titles = $x_doc_cl->query('/metadata/tags/title');

// The db node lists never change, so query them once, outside the loops.
$d_nodes_titles = $x_doc_db->query('/metadata/tags/title');
$d_nodes_ids = $x_doc_db->query('/metadata/tags/id');

// Use < rather than <=: item($length) is past the end and returns null.
for ($i = 0; $i < $c_nodes->length; ++$i) {

  $c_node = $c_nodes->item($i);
  $c_node_title = $c_nodes_titles->item($i);
  if ($c_node_title === null) { continue; } // guard against lists of unequal length
  $c_title = trim($c_node_title->textContent);

  for ($j = 0; $j < $d_nodes_titles->length; ++$j) {

    $d_node_id = $d_nodes_ids->item($j);
    if ($d_node_id === null) { continue; }

    // Match only when the trimmed titles are identical.
    if ($c_title === trim($d_nodes_titles->item($j)->textContent)) {

      // Put the id text inside <db_id>, then attach <db_id> to the client <tags>.
      $db_id = $doc_cl->createElement("db_id");
      $db_id->appendChild($doc_cl->createTextNode($d_node_id->nodeValue));
      $c_node->appendChild($db_id);

    }
  }
}
// saveXML() only returns the XML string; echo it or write it out with save().
echo $doc_cl->saveXML();
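The nested loops compare every client title with every db title. A possible tidier variant, offered only as a sketch, builds a title-to-id array from the db document first and then walks the client document once. It assumes the same exact-match rule on trimmed titles, that titles are unique in the db file, and it uses made-up inline XML; `$ids_by_title` is a name introduced here for illustration.

```php
<?php
// Made-up inline documents standing in for the two files.
$doc_db = new DOMDocument();
$doc_db->loadXML('<metadata><tags><title>Some Title</title><id>4361</id></tags></metadata>');
$doc_cl = new DOMDocument();
$doc_cl->loadXML('<metadata><tags><title> Some Title </title></tags><tags><title>Other</title></tags></metadata>');

// Build a title => id map from the db document in a single pass.
$ids_by_title = [];
foreach ((new DOMXPath($doc_db))->query('/metadata/tags') as $d_tag) {
    $title = $d_tag->getElementsByTagName('title')->item(0);
    $id = $d_tag->getElementsByTagName('id')->item(0);
    if ($title !== null && $id !== null) {
        $ids_by_title[trim($title->textContent)] = trim($id->textContent);
    }
}

// Walk the client <tags> elements once and add <db_id> where the title is known.
foreach ((new DOMXPath($doc_cl))->query('/metadata/tags') as $c_tag) {
    $title = $c_tag->getElementsByTagName('title')->item(0);
    if ($title === null) {
        continue; // nothing to match on
    }
    $key = trim($title->textContent);
    if (isset($ids_by_title[$key])) {
        // createElement() with a value is fine for plain ids like these.
        $c_tag->appendChild($doc_cl->createElement('db_id', $ids_by_title[$key]));
    }
}

echo $doc_cl->saveXML();
```

With roughly 200 'tags' elements per file either approach is fast enough; the array lookup mainly removes the index bookkeeping.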
devnull
  • Although this is working code that creates the elements and inserts them where I expect, I have no doubt that someone could write a more elegant solution. – devnull Feb 25 '17 at 20:16