PHP Remove Empty Node Values From XML

Question

I have generated an XML document. It contains a few empty nodes that I want to remove.

My XML

https://pastebin.com/wzjmZChU

I want to remove all empty nodes from my XML. Using XPath, I tried:

$xpath = '//*[not(node())]';
foreach ($xml->xpath($xpath) as $remove) {
    unset($remove[0]);
}

The above code works up to a point, but it does not remove all empty nodes.

Edit

I have tried the above code and it only works for a single level.

Possible duplicate of [Remove multiple empty nodes with SimpleXML](http://stackoverflow.com/questions/5559551/remove-multiple-empty-nodes-with-simplexml) — splash58, Apr 17 '17 at 12:28
what about this? http://stackoverflow.com/questions/8603237/remove-empty-tags-from-a-xml-with-php — Oz Radiano, Apr 17 '17 at 17:15

score 4 · Answer 1 · answered Apr 21 '17 at 14:41

If you consider any element node without a child node to be empty, //*[not(node())] will match it. But removing those element nodes can leave their parents empty in turn, so you need an expression that matches not only the currently empty element nodes but also those with only empty descendants (recursively). Additionally, you might want to avoid removing the document element even if it is empty, because that would result in an invalid document.
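A minimal sketch of the problem, using a small made-up document (not the asker's XML): a single pass with //*[not(node())] removes the innermost element but leaves its parent behind, now empty.

```php
<?php
// Hypothetical two-level sample: <bar> only becomes empty after <empty/> is removed.
$xml = new SimpleXMLElement('<foo><bar><empty/></bar><bar>text</bar></foo>');

// Matches only elements that have no child nodes at the time of the query.
foreach ($xml->xpath('//*[not(node())]') as $remove) {
    unset($remove[0]);
}

$result = $xml->asXML();
echo $result;
```

The first bar is still present in the result, serialized as an empty element, because it had a child when the expression was evaluated.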

Building up the expression

Select the document element
/*
Any descendant of the document element
/*//*
...with only whitespace as text content (this includes the text of its descendants)
/*//*[normalize-space(.) = ""]
...and no attributes
/*//*[normalize-space(.) = "" and not(@*)]
...or any descendants with attributes
/*//*[normalize-space(.) = "" and not(@* or .//*[@*])]
...or comments
/*//*[normalize-space(.) = "" and not(@* or .//*[@*] or .//comment())]
...or processing instructions
/*//*[normalize-space(.) = "" and not(@* or .//*[@*] or .//comment() or .//processing-instruction())]
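To see how each condition narrows the selection, the following sketch counts the matches of every intermediate expression against the sample document used in this answer. The step labels and the $steps and $counts variables are only illustrative names.

```php
<?php
$xmlString = <<<'XML'
<foo>
  <empty/>
  <empty></empty>
  <bar><empty/></bar>
  <bar attr="value"><empty/></bar>
  <bar>text</bar>
  <bar>
   <empty/>
   text
  </bar>
  <bar>
   <!-- comment -->
  </bar>
</foo>
XML;

$xml = new SimpleXMLElement($xmlString);

// Each entry adds one more condition to the previous expression.
$steps = [
    'all descendants'             => '/*//*',
    'blank text content'          => '/*//*[normalize-space(.) = ""]',
    'no own attributes'           => '/*//*[normalize-space(.) = "" and not(@*)]',
    'no descendant attributes'    => '/*//*[normalize-space(.) = "" and not(@* or .//*[@*])]',
    'no comments'                 => '/*//*[normalize-space(.) = "" and not(@* or .//*[@*] or .//comment())]',
    'no processing instructions'  => '/*//*[normalize-space(.) = "" and not(@* or .//*[@*] or .//comment() or .//processing-instruction())]',
];

$counts = [];
foreach ($steps as $label => $expression) {
    $counts[$label] = count($xml->xpath($expression));
    printf("%-28s %d\n", $label, $counts[$label]);
}
```

For this sample the counts go 10, 8, 7, 7, 6, 6: the attribute test drops the bar with attr="value", and the comment test drops the bar that contains only a comment.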

Put together

Iterate the result in reverse order, so that child nodes are deleted before parents.

$xmlString = <<<'XML'
<foo>
  <empty/>
  <empty></empty>
  <bar><empty/></bar>
  <bar attr="value"><empty/></bar>
  <bar>text</bar>
  <bar>
   <empty/>
   text
  </bar>
  <bar>
   <!-- comment -->
  </bar>
</foo>
XML;

$xml = new SimpleXMLElement($xmlString);

$xpath = '/*//*[
  normalize-space(.) = "" and
  not(
    @* or 
    .//*[@*] or 
    .//comment() or
    .//processing-instruction()
  )
]';
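// xpath() returns matches in document order; reverse it so children go before parents.
foreach (array_reverse($xml->xpath($xpath)) as $remove) {
  // Unsetting the [0] self-reference removes the element from the document.
  unset($remove[0]);
}

echo $xml->asXML();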

Output:

<?xml version="1.0"?>
<foo>



  <bar attr="value"/>
  <bar>text</bar>
  <bar>

   text
  </bar>
  <bar>
   <!-- comment -->
  </bar>
</foo>
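The blank lines in the output are the whitespace text nodes that stood between the removed elements. If they matter, one option (an addition, not part of the approach above) is to reload the result into DOMDocument with preserveWhiteSpace disabled and formatOutput enabled. This is a sketch on a shortened copy of the output. Elements with mixed content, such as the bar that contains text, are generally left as they are.

```php
<?php
// Shortened copy of the output above, including the leftover blank lines.
$cleaned = <<<'XML'
<foo>



  <bar attr="value"/>
  <bar>text</bar>
  <bar>
   <!-- comment -->
  </bar>
</foo>
XML;

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false; // has to be set before loading the XML
$dom->formatOutput = true;        // re-indent on serialization
$dom->loadXML($cleaned);

$pretty = $dom->saveXML();
echo $pretty;
```

When starting from the SimpleXMLElement, $xml->asXML() can be passed to loadXML() in place of the literal string.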
