Suppose I have a file like this:

<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
</data>

And if I append an element:

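import xml.etree.ElementTree as ET

newTagContentString = """
<usertype id="99999">
    <role name="admin" />
</usertype>"""
# c is the root <data> element of the parsed file
newXMLElement = ET.fromstring(newTagContentString)
c.append(newXMLElement)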

It isn't properly indented:

<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
<usertype id="99999">
    <role name="admin" />
</usertype></data>

Is there a way to make it indent properly?

BTW c.insert(0, newXMLElement) also doesn't keep nice spacing:

<data>
    <usertype id="99999">
    <role name="admin" />
</usertype><country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
</data>
shinzou
  • Similar questions have been asked before. For example: https://stackoverflow.com/questions/28813876/how-do-i-get-pythons-elementtree-to-pretty-print-to-an-xml-file. If you can use lxml instead of ElementTree, then pretty-printing is easier. – mzjn Mar 25 '18 at 19:22
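
For readers on Python 3.9 or later, the standard library's ElementTree has an `indent()` helper that rewrites the whitespace of a tree in place. The following is only a minimal sketch under that version assumption; the shortened sample document and the names `root`, `new_element` and `result` are illustrative and not from the original post.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the document in the question.
root = ET.fromstring("""<data>
    <country name="Liechtenstein">
        <rank>1</rank>
    </country>
</data>""")

# Build the new element from a string and append it to the root.
new_element = ET.fromstring('<usertype id="99999"><role name="admin" /></usertype>')
root.append(new_element)

# ET.indent (Python 3.9+) replaces whitespace-only text and tails
# so that every nesting level is indented by `space`.
ET.indent(root, space="    ")

result = ET.tostring(root, encoding="unicode")
print(result)
```

Because `indent()` modifies the tree itself, it has to be called after the last `append` or `insert` and before the tree is written out.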

1 Answer

I'm assuming the problem you're facing is with how the output is printed. Here's a code snippet using the minidom module, which re-serializes your XML with indentation:

import xml.etree.ElementTree as ET
import xml.dom.minidom

parent_file_path = 'files/49473329.xml'
parent_tree = ET.parse(parent_file_path)
parent = parent_tree.getroot()
# Round-trip through minidom to get an indented string
xmlstr = xml.dom.minidom.parseString(ET.tostring(parent)).toprettyxml()
print(xmlstr)

Where 'files/49473329.xml' is your mis-indented file:

<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor direction="E" name="Austria" />
        <neighbor direction="W" name="Switzerland" />
    </country>
<usertype id="99999">
    <role name="admin" />
</usertype></data>

Hope this helps

Jimmy Lee Jones
  • But this also looks not properly spaced, the closing data tag should be in the next line and there's no indentation for that element. – shinzou Mar 25 '18 at 08:15
  • @shinzou Can you please add the output you're getting? – Jimmy Lee Jones Mar 25 '18 at 08:22
  • https://i.imgur.com/ahcRiaV.png I see you meant to prettify it, but it looks like it adds a newline before and after every line, that's not what I want. – shinzou Mar 25 '18 at 09:04
  • @shinzou This is a known issue with minidom. Once you have the pretty-printed XML string, you can strip away any unwanted newlines. – Jimmy Lee Jones Mar 25 '18 at 09:13
  • Uh so more logic... How come something so basic isn't part of minidom or ElementTree? I'm sure I'm not the first one with this problem. – shinzou Mar 25 '18 at 09:18
  • Is there at least a way to not include the `<?xml version="1.0" ?>` from the start of the file? – shinzou Mar 25 '18 at 10:05
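
The two points raised in the comments above (the extra blank lines and the leading XML declaration) can both be handled with a little post-processing. This is a rough sketch rather than a definitive fix: the inline `raw` string is a shortened stand-in for the file on disk, and the variable names are illustrative. It relies on calling `toprettyxml()` on the document's root element instead of the document, which leaves out the declaration, and on filtering out whitespace-only lines afterwards.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Shortened stand-in for the mis-indented file from the question.
raw = """<data>
    <country name="Liechtenstein">
        <rank>1</rank>
    </country>
<usertype id="99999">
    <role name="admin" />
</usertype></data>"""

root = ET.fromstring(raw)
dom = xml.dom.minidom.parseString(ET.tostring(root))

# Pretty-printing the root element (not the Document node)
# omits the <?xml ...?> declaration.
pretty = dom.documentElement.toprettyxml(indent="    ")

# The whitespace-only text nodes of the original file become
# blank lines in the output; keep only lines with real content.
cleaned = "\n".join(line for line in pretty.splitlines() if line.strip())
print(cleaned)
```

Note that the line filter would also drop blank lines inside multi-line text content, so it is only suitable for data-style documents like this one.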