
I am trying to insert a tag into a big XML file, and I want to change only the line I insert, not any of the other lines. Because the logic that decides where the lines must be inserted is not simple, I have to use an XML parser. But the parser I used changed the other lines as well.

original file:

<?xml version='1.0' encoding='UTF-8'?>
<tag>
    <!-- Comment with ÄÜÖ -->
    <name>test</name>
    <name></name>
</tag>

desired file:

<?xml version='1.0' encoding='UTF-8'?>
<tag>
    <!-- Comment with ÄÜÖ -->
    <name>test</name>
    <name>test2</name>
    <name></name>
</tag>

lxml, for example, writes the non-ASCII characters as numeric character references and automatically collapses empty tags into the self-closing form when saving.

lxml result:

<tag>
    <!-- Comment with &#195;&#132;&#195;&#156;&#195;&#150; -->
    <name>test</name>
    <name>test2</name>
    <name />
</tag>

lxml code:

from lxml import etree

with open('org.xml', 'r') as xml_file:
    xml_tree = etree.parse(xml_file)
# adding/manipulating xml
# ...
# saving xml
with open('final.xml', 'wb') as final_file:
    final_file.write(etree.tostring(xml_tree))
Sir2B
  • What does the **actual** *xml* look like? – CristiFati Apr 25 '18 at 12:19
  • Can you show how you update or append the xml please? – Sumit Jha Apr 25 '18 at 12:20
  • https://stackoverflow.com/questions/3884876/how-to-create-an-xml-text-node-with-an-empty-string-value-in-java might be helpful to understand xml empty nodes. – Sumit Jha Apr 25 '18 at 12:22
  • It's irrelevant how I change the file. That would be very long code. The point is only that when I open the file with an xml parser and write an xml file again, there are changes in the file, even though I haven't actually made any changes to the xml. – Sir2B Apr 25 '18 at 13:03
  • @SumitJha so it's not possible to recreate the original xml with the same representation of empty nodes? Would there be a solution for avoiding the character entities? – Sir2B Apr 25 '18 at 13:05
  • You can have a look at the `html` method when converting xml to string [here](https://stackoverflow.com/questions/12460677/python-etree-control-empty-tag-format). However, I think you lose some information, e.g. some xml headers. – Sumit Jha Apr 25 '18 at 13:21
  • Set the encoding: `final_file.write(etree.tostring(xml_tree, encoding="utf-8"))` – Josh Voigts Apr 25 '18 at 13:22
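
The suggestions in the comments above can be sketched roughly as follows. This is only a minimal sketch, not a guaranteed byte-for-byte round trip: it assumes the file really is UTF-8, it parses from bytes so that lxml reads the encoding from the XML declaration, it serializes with an explicit `encoding` plus `xml_declaration=True`, and it sets `text = ''` on empty elements so they are written as `<name></name>`. The names `ORIGINAL` and `insert_name_after_first` are hypothetical helpers made up for this illustration. Note that the empty-text trick also expands any elements that were self-closing in the original, and other details (attribute quoting, whitespace inside tags) may still be normalized by the serializer.

```python
from lxml import etree

# Stand-in for the contents of 'org.xml' (hypothetical sample data).
ORIGINAL = (
    "<?xml version='1.0' encoding='UTF-8'?>\n"
    "<tag>\n"
    "    <!-- Comment with ÄÜÖ -->\n"
    "    <name>test</name>\n"
    "    <name></name>\n"
    "</tag>\n"
).encode("utf-8")


def insert_name_after_first(xml_bytes: bytes, value: str) -> bytes:
    """Insert <name>value</name> after the first <name> and re-serialize."""
    # Parsing bytes lets lxml honour the encoding in the XML declaration.
    root = etree.fromstring(xml_bytes)

    first = root.find("name")
    new = etree.Element("name")
    new.text = value
    # Reuse the whitespace after the first <name> so indentation matches.
    new.tail = first.tail
    root.insert(root.index(first) + 1, new)

    # Give empty elements an empty string as text so they are not
    # collapsed to <name/>. Comments are skipped (their tag is not a str).
    for el in root.iter():
        if isinstance(el.tag, str) and el.text is None and len(el) == 0:
            el.text = ""

    # An explicit encoding avoids numeric character references for ÄÜÖ;
    # xml_declaration=True writes the <?xml ...?> line back out.
    return etree.tostring(
        root.getroottree(), encoding="UTF-8", xml_declaration=True
    )


if __name__ == "__main__":
    print(insert_name_after_first(ORIGINAL, "test2").decode("utf-8"))
```

With a real file, the bytes would come from `open('org.xml', 'rb').read()` and the result would be written with `open('final.xml', 'wb')`. If an exact, untouched copy of every other line is required, a diff between `org.xml` and `final.xml` is a reasonable way to check what the serializer still changed.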

0 Answers