I'm trying to delete every NodePrices element whose NodeName does not match a specific value (Place2 in this example). Here's a sample of the XML:

<DocHeader>
    <DocTitle>Node Price Report</DocTitle>
    <DocRevision>1</DocRevision>
    <DocConfidentiality>
        <DocConfClass>PUB</DocConfClass>
    </DocConfidentiality>
    <CreatedAt>2018-02-03T13:02:01</CreatedAt>
</DocHeader>
<DocBody>
  <NodePrices>
    <NodeName>Place1</NodeName>
    <Contact>Employee1</Contact>
  </NodePrices>
  <NodePrices>
    <NodeName>Place2</NodeName>
    <Contact>Employee2</Contact>
  </NodePrices>
  <NodePrices>
    <NodeName>Place3</NodeName>
    <Contact>Employee3</Contact>
  </NodePrices>
</DocBody>

I found a previously asked question that looks like it answers mine; however, the results are not what I expected. When I run the code and echo the results, they are what I expect: I see Place2. When I save the results to a file, Place2 is missing and all I have is the DocHeader. What am I doing wrong?

The previous post is "How to modify xml file using PHP".

Here's my PHP

$dom=new DOMDocument();
$dom->load("Nodes.xml");

$root=$dom->documentElement; 

$nodesToDelete=array();

$markers=$root->getElementsByTagName('NodePrices');

// Loop through the NodePrices elements and collect the ones to remove
foreach ($markers as $marker) {
    $NodeName=$marker->getElementsByTagName('NodeName')->item(0)->textContent;

    if($NodeName=='Place2') {
        continue;
    }

    $nodesToDelete[]=$marker;
}

// Delete the collected nodes
foreach ($nodesToDelete as $node) {
    $node->parentNode->removeChild($node);
}

echo $dom->saveXML();
$dom->save('FilteredNodes.xml');
Phil
L Helmer
  • Please post a fuller XML with root and abbreviate repeating nodes with `...`. – Parfait Feb 04 '18 at 22:32
  • How are you verifying the problem? How are you viewing the file? Are you looking at the right file at all? I'd try `$dom->save(__DIR__ . '/FilteredNodes.xml');` to make sure the file saves in the same directory as your script – Phil Feb 05 '18 at 00:28
  • Just realized that the code works; however, it leaves a blank line for every node that is removed. In my case this results in about 400 blank lines. When I was verifying the results I didn't scroll down to find the data I was looking for. My question should have been: "How do I remove all these blank lines?" I tried `preg_replace("/(^[\r\n]*|[\r\n]+)[\s\t]*[\r\n]+/", "\n", $string);` but that didn't remove the empty lines. Still looking for a solution. – L Helmer Feb 05 '18 at 10:53
  • @LHelmer ... to help, please post a complete XML file. Right now, your post has no root element, so it is not well-formed XML. – Parfait Feb 05 '18 at 17:11

1 Answer

When you remove a node from an XML document through the DOM, it commonly leaves a gap. This comes from the formatting of the document: there is usually a DOMText node (the newline and indentation) immediately before the actual element node. To close the gap, you need to remove that text node as well as the element itself:

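foreach ($nodesToDelete as $node) {
    // The newline/indentation before an element is stored as a separate DOMText sibling
    $prevNode = $node->previousSibling;
    // instanceof is false for null, so no separate null check is needed
    if ($prevNode instanceof DOMText) {
        $node->parentNode->removeChild($prevNode);
    }
    $node->parentNode->removeChild($node);
}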
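
For reference, here is a self-contained sketch of the whole filter that can be run without any files. Several things in it are assumptions rather than part of the original post: the `<Report>` root element (added only so the sample is well-formed), the helper name `filterNodePrices`, and the extra `trim()` check, which removes the preceding text node only when it contains nothing but whitespace.

```php
<?php
// Hypothetical sample data: the <Report> root is an assumption, not from the original file.
$xml = <<<XML
<Report>
<DocBody>
  <NodePrices>
    <NodeName>Place1</NodeName>
  </NodePrices>
  <NodePrices>
    <NodeName>Place2</NodeName>
  </NodePrices>
  <NodePrices>
    <NodeName>Place3</NodeName>
  </NodePrices>
</DocBody>
</Report>
XML;

// Hypothetical helper: keeps only the NodePrices whose NodeName equals $keep.
function filterNodePrices(string $xml, string $keep): string
{
    $dom = new DOMDocument();
    $dom->loadXML($xml);

    // Collect first, delete later: the DOMNodeList is live and changes as nodes are removed.
    $nodesToDelete = [];
    foreach ($dom->getElementsByTagName('NodePrices') as $marker) {
        $name = $marker->getElementsByTagName('NodeName')->item(0);
        if ($name !== null && $name->textContent === $keep) {
            continue;
        }
        $nodesToDelete[] = $marker;
    }

    foreach ($nodesToDelete as $node) {
        $prev = $node->previousSibling;
        // Remove the preceding text node only if it is pure whitespace (indentation/newline).
        if ($prev instanceof DOMText && trim($prev->nodeValue) === '') {
            $node->parentNode->removeChild($prev);
        }
        $node->parentNode->removeChild($node);
    }

    return $dom->saveXML();
}

$result = filterNodePrices($xml, 'Place2');
echo $result;
```

Another option that may be worth trying is to set `$dom->preserveWhiteSpace = false;` before loading and `$dom->formatOutput = true;` before saving, which lets libxml discard the whitespace-only text nodes and re-indent the output itself.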
Nigel Ren