I have the following XML output:
<?xml version='1.0' encoding='ISO-8859-1'?>
<?xml-stylesheet type='text/xsl' href='image_metadata_stylesheet.xsl'?>
<dataset>
<images>
<image file='VideoExtract/testset/10224.jpg'>
<box top='436' left='266' width='106' height='61'>
<label>1</label>
</box>
</image>
<image file='VideoExtract/testset/1044.jpg'>
<box top='507' left='330' width='52' height='27'>
<label>2</label>
</box>
</image>
<image file='VideoExtract/testset/10675.jpg'>
</image>
</images>
</dataset>
From this, I want to delete all the image nodes that don't have any child nodes. For example, the third image node within images has no child nodes. How can I delete such nodes? The desired output would be:
<?xml version='1.0' encoding='ISO-8859-1'?>
<?xml-stylesheet type='text/xsl' href='image_metadata_stylesheet.xsl'?>
<dataset>
<images>
<image file='VideoExtract/testset/10224.jpg'>
<box top='436' left='266' width='106' height='61'>
<label>1</label>
</box>
</image>
<image file='VideoExtract/testset/1044.jpg'>
<box top='507' left='330' width='52' height='27'>
<label>2</label>
</box>
</image>
</images>
</dataset>
I have tried the following, but it doesn't work:
from lxml import etree as ET
root = ET.parse('testxml.xml')
for child in root.iterfind('targetElement'):
    if(len(child.attrib) < 1 and len(child) < 1):
        child.getparent().remove(child)
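For comparison, here is a minimal sketch of one possible approach, not a definitive fix. In the attempt above, 'targetElement' does not match any tag in the document, and the len(child.attrib) < 1 test would skip the empty image node anyway because it carries a file attribute. The sketch instead walks every element and removes image children that contain no child elements. It swaps in the standard library's xml.etree.ElementTree so that it runs without extra packages; the calls it uses (iter, remove, len, tag) are also available on lxml elements. The helper name remove_childless, its tag parameter, and the shortened SAMPLE string are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the document from the question (illustrative only).
SAMPLE = """<dataset>
<images>
<image file='VideoExtract/testset/10224.jpg'>
<box top='436' left='266' width='106' height='61'>
<label>1</label>
</box>
</image>
<image file='VideoExtract/testset/10675.jpg'>
</image>
</images>
</dataset>"""


def remove_childless(root, tag="image"):
    """Remove every <tag> element that has no child elements; return the count."""
    removed = 0
    # Snapshot the elements first so the tree is not modified while it is walked.
    for parent in list(root.iter()):
        # Copy the child list as well, because remove() mutates the parent.
        for child in list(parent):
            if child.tag == tag and len(child) == 0:
                parent.remove(child)
                removed += 1
    return removed


root = ET.fromstring(SAMPLE)
removed_count = remove_childless(root)
remaining_files = [img.get("file") for img in root.iter("image")]
print(removed_count, remaining_files)
```

The check is restricted to one tag on purpose: a test of len(child) == 0 alone would also match the label nodes, which have text but no child elements.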