
I am reading an XML file, adding some tags, and writing it back.

The file I read starts with <?xml version="1.0" encoding="UTF-8" standalone="yes"?>, but my output only has <?xml version="1.0" ?>.

I use the following code:

    import os
    from xml.dom import minidom
    import xml.etree.ElementTree as ET

    # xml_file is the path to the XML file; it is defined elsewhere
    tree = ET.parse(xml_file)
    root = tree.getroot()
    access = ""

    # ... (rest of the processing logic)

    # Serialize to a temporary byte string, then reparse with minidom to control indentation
    rough_string = ET.tostring(root, encoding='utf-8')
    reparsed = minidom.parseString(rough_string)

    # Write the formatted XML back to the original file, skipping empty lines
    with open(xml_file, 'w', encoding='utf-8') as f:
        for line in reparsed.toprettyxml(indent="  ").splitlines():
            if line.strip():
                f.write(line + '\n')

How can I preserve the XML declaration from my original document?

Edit:

I solved it by writing the declaration manually and skipping the one that minidom generates:

    with open(xml_file, 'w', encoding='utf-8') as f:
        custom_line = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f.write(custom_line + '\n')
        for line in reparsed.toprettyxml(indent="  ").splitlines():
            if line.strip() and not line.startswith('<?xml'):
                f.write(line + '\n')
Marc
  • If you have a solution, post an Answer. The solution to a problem does not belong in the Question. – mzjn Aug 04 '23 at 09:47
  • There are many similar questions already. What about (for example) https://stackoverflow.com/q/68023690/407651 and https://stackoverflow.com/q/15356641/407651? – mzjn Aug 04 '23 at 09:54

2 Answers


As far as I know, xml.etree.ElementTree does not support a standalone attribute in the XML declaration it writes. With minidom you can do it (the standalone parameter was added in Python 3.9), like this:

from xml.dom.minidom import parseString

dom3 = parseString('<myxml>Some data<empty/> some more data</myxml>')

# write declaration with standalone
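# open the file as UTF-8 so the bytes on disk match the declared encoding
with open("myfile.xml", "w", encoding="utf-8") as xml_file:
    # addindent is the per-level indentation; indent would only prefix the root element
    dom3.writexml(xml_file, addindent='  ', newl='\n', encoding='utf-8', standalone=True)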

This gives the XML declaration:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
. . .

See the xml.dom.minidom documentation for details. An alternative solution for xml.etree.ElementTree can be found here.
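
If you want to keep the ElementTree processing from the question, a minimal sketch is to pass standalone to toprettyxml as well (this needs Python 3.9 or newer). The helper name pretty_xml_bytes and the sample document are made up for illustration; they are not from the question.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def pretty_xml_bytes(root, standalone=True):
    """Pretty-print an ElementTree element with a full XML declaration.

    Hypothetical helper: serializes with ElementTree, reparses with
    minidom, and lets minidom write encoding and standalone.
    """
    rough = ET.tostring(root, encoding="utf-8")
    dom = minidom.parseString(rough)
    # With an encoding argument, toprettyxml returns bytes
    pretty = dom.toprettyxml(indent="  ", encoding="UTF-8", standalone=standalone)
    # Drop whitespace-only lines, as in the question
    lines = [line for line in pretty.splitlines() if line.strip()]
    return b"\n".join(lines) + b"\n"


# Sample document standing in for the parsed file (assumption)
root = ET.fromstring("<config><item>1</item></config>")
ET.SubElement(root, "added").text = "new"
result = pretty_xml_bytes(root)
print(result.decode("utf-8"))
```

Because the result is bytes, write it with open(xml_file, "wb") rather than in text mode.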

Hermann12

I solved it by adding these lines, which write the declaration manually and skip the one minidom generates:

with open(xml_file, 'w', encoding='utf-8') as f:
    custom_line = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    f.write(custom_line + '\n')
    for line in reparsed.toprettyxml(indent="  ").splitlines():
        if line.strip() and not line.startswith('<?xml'):
            f.write(line + '\n')
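
Hard-coding the declaration works when every input file uses the same one. If that is not guaranteed, one option is to copy the declaration from the original text before overwriting the file. This is only a sketch: read_xml_declaration and its fallback value are hypothetical, and the regular expression assumes the declaration sits at the very start of the file.

```python
import re

# Matches an XML declaration at the start of the text, but not other
# processing instructions such as <?xml-stylesheet ...?>
_DECL_RE = re.compile(r'^\s*(<\?xml\s[^?]*\?>)')

DEFAULT_DECL = '<?xml version="1.0" encoding="UTF-8"?>'  # assumed fallback


def read_xml_declaration(text, default=DEFAULT_DECL):
    """Return the XML declaration found at the start of text, else default."""
    match = _DECL_RE.match(text)
    return match.group(1) if match else default


original = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n<root/>'
custom_line = read_xml_declaration(original)
print(custom_line)
```

The returned string can then take the place of the hard-coded custom_line in the loop above.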
Marc