A few points to mention here:
Firstly, your test element.text is not None always returns True if you parse your XML file as given above using xml.etree.ElementTree. Each node that appears to have no text still contains a newline character before its closing tag, so its text is a whitespace string such as \n rather than None. An alternative is to use lxml.etree.parse with an lxml.etree.XMLParser that ignores blank text, as below.
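To make the first point concrete, here is a minimal sketch using only the standard library. The XML snippet and the has_text helper are made up for illustration (the original test.xml may look different); the helper is one possible way to treat whitespace-only text as "no text" without switching parsers.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: <empty> contains only a newline and indentation.
xml_text = """<root>
  <empty>
  </empty>
  <name>text1</name>
</root>"""

root = ET.fromstring(xml_text)
empty = root.find("empty")
name = root.find("name")


def has_text(element):
    """Return True only if the element holds non-whitespace text."""
    return element.text is not None and element.text.strip() != ""


print(repr(empty.text))        # a whitespace-only string, not None
print(empty.text is not None)  # the naive test passes for <empty>
print(has_text(empty))         # the stripped test rejects <empty>
print(has_text(name))          # the stripped test accepts <name>
```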
Secondly, it's not a good idea to append to a tree while iterating over it, for the same reason that this code loops forever:
>>> a = [1,2,3,4]
>>> for k in a:
...     a.append(5)
See @Alex Martelli's answer to the question Modifying list while iterating for more on this issue.
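For comparison, a small sketch of the usual workaround for the list case: iterate over a snapshot of the list, so that appending to the original cannot extend the loop.

```python
a = [1, 2, 3, 4]

# list(a) is a copy taken before the loop starts, so the loop runs
# exactly four times even though `a` keeps growing.
for k in list(a):
    a.append(5)

print(a)  # [1, 2, 3, 4, 5, 5, 5, 5]
```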
Hence, you should make a buffer XML tree and build it accordingly rather than modifying your tree while traversing it.
import copy
from lxml import etree

# remove_blank_text drops whitespace-only text, so empty nodes get text None
p = etree.XMLParser(remove_blank_text=True)
path = 'test.xml'
tr = etree.parse(path, parser=p)
root = tr.getroot()

# Build a separate buffer tree instead of modifying the tree being traversed
buffer = etree.Element(root.tag)
for node in root:
    bnode = etree.Element(node.tag)
    for element in node.iter():
        if element.text is not None:
            # lxml moves an element when it is appended elsewhere, so append a copy
            bnode.append(copy.deepcopy(element))
    buffer.append(bnode)

print(etree.tostring(buffer, encoding='unicode'))
Sample run and results:
Chip chip@ 01:01:53@ ~: python stackoverflow.py
<root><node1><name1>text1</name1></node1><node2><name2>text2</name2></node2></root>
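If lxml is not available, the same idea can be sketched with the standard library alone. This is only an illustration under assumptions: the input below is a made-up stand-in for test.xml, only the direct children of each node are examined, and whitespace-only text is filtered by stripping it rather than by a parser option.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical stand-in for test.xml; <blank1> holds only whitespace.
xml_text = """<root>
  <node1>
    <name1>text1</name1>
    <blank1>
    </blank1>
  </node1>
  <node2>
    <name2>text2</name2>
  </node2>
</root>"""

root = ET.fromstring(xml_text)

# Build a separate buffer tree; the parsed tree is never modified.
buffer = ET.Element(root.tag)
for node in root:
    bnode = ET.Element(node.tag)
    for element in node:  # direct children only (a simplification)
        if element.text is not None and element.text.strip():
            child = copy.deepcopy(element)
            child.tail = None  # drop the indentation that followed the element
            bnode.append(child)
    buffer.append(bnode)

result = ET.tostring(buffer, encoding="unicode")
print(result)
```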
NOTE: you can always pretty-print the XML tree using the lxml package in Python, following the answers in Pretty printing XML in Python, since the tree I printed above is rather hard to read with the naked eye.
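As a rough sketch of pretty printing, here is a standard-library alternative to lxml that uses xml.dom.minidom. The compact string is the output from the sample run above, and the two-space indent is an arbitrary choice.

```python
from xml.dom import minidom

# The one-line output from the sample run above.
compact = (
    "<root><node1><name1>text1</name1></node1>"
    "<node2><name2>text2</name2></node2></root>"
)

# toprettyxml re-serializes the document with one element per line.
pretty = minidom.parseString(compact).toprettyxml(indent="  ")
print(pretty)
```

Note that toprettyxml also adds an XML declaration as the first line of its output.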