Python toprettyxml() formatting problems

Question

I'm trying to process XML using Python's minidom, and then output the result using toprettyxml(). I ran into two problems:

1. There are added blank lines.
2. There are added newlines and tabs for text nodes.

Here's the code and output:

$ cat test.py
from xml.dom import minidom

dom = minidom.parse("test.xml")
print(dom.toprettyxml())

$ cat test.xml
<?xml version="1.0" encoding="UTF-8"?>

<store>
    <product>
        <fruit>orange</fruit>
    </product>
</store>


$ python test.py
<?xml version="1.0" ?>
<store>


    <product>


        <fruit>
            orange
        </fruit>


    </product>


</store>

I can work around problem 1 using strip() to remove blank lines, and I can work around problem 2 using the hack (fixed_writexml) described in this link: http://ronrothman.com/public/leftbraned/xml-dom-minidom-toprettyxml-and-silly-whitespace/, but I was wondering if there's a better solution, since the hack is almost 3 years old now. I'm open to using something other than minidom, but I'd like to avoid adding external packages like lxml.
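For reference, the strip()-based workaround for problem 1 can be sketched roughly as follows. The helper name pretty_without_blank_lines and the inline SAMPLE_XML string are made up for illustration (the original code is not shown above), and the sketch only filters whitespace-only lines from the output. It does nothing about problem 2.

```python
from xml.dom import minidom

# Sample input mirroring test.xml from the question (XML declaration omitted).
SAMPLE_XML = """<store>
    <product>
        <fruit>orange</fruit>
    </product>
</store>
"""


def pretty_without_blank_lines(dom, indent="    "):
    """Pretty-print a minidom document, then drop whitespace-only lines.

    Hypothetical helper: it post-processes the string returned by
    toprettyxml() instead of changing how minidom writes nodes.
    """
    pretty = dom.toprettyxml(indent=indent)
    return "\n".join(line for line in pretty.splitlines() if line.strip())


dom = minidom.parseString(SAMPLE_XML)
cleaned = pretty_without_blank_lines(dom)
print(cleaned)
```

Because this works on the finished string, it would also remove blank lines that are meaningful text content, so it is only reasonable when whitespace inside elements does not matter.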

You may check out my solution: http://stackoverflow.com/a/39984422/2687547 – dganesh2002, Oct 21 '16 at 22:28
Does this answer your question? [Empty lines while using minidom.toprettyxml](https://stackoverflow.com/questions/14479656/empty-lines-while-using-minidom-toprettyxml) – Josh Correia, Feb 12 '20 at 17:38

score 2 · Accepted Answer · answered Jun 06 '11 at 18:46

One solution is to patch the minidom library with the patch proposed in the bug you mention.

I haven't tested it myself, and it's a bit hacky too, so it may not suit you!

CharlesB


Thanks, I tested the patch, and it fixes both issues! Instead of directly patching minidom.py in /usr/lib/python, I did something similar to the ronrothman link above, where the function is replaced at runtime. That way, it can run anywhere. – Ravi Jun 06 '11 at 19:23
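A minimal sketch of what replacing the function at runtime could look like. This is neither the patch from the bug report nor Ravi's actual code: the body of fixed_writexml below is an illustrative assumption that skips whitespace-only text nodes and writes a lone text child inline. SAMPLE_XML is a stand-in for test.xml.

```python
from xml.dom import minidom
from xml.dom.minidom import Node
from xml.sax.saxutils import escape, quoteattr

# Stand-in for test.xml from the question (XML declaration omitted).
SAMPLE_XML = """<store>
    <product>
        <fruit>orange</fruit>
    </product>
</store>
"""


def fixed_writexml(self, writer, indent="", addindent="", newl=""):
    """Illustrative replacement for minidom.Element.writexml.

    Assumption: whitespace-only text nodes are insignificant and a lone
    text child may be stripped and written inline.
    """
    writer.write(indent + "<" + self.tagName)
    attrs = self.attributes
    for name in sorted(attrs.keys()):
        writer.write(" %s=%s" % (name, quoteattr(attrs[name].value)))

    # Ignore text nodes that contain nothing but whitespace.
    children = [
        child for child in self.childNodes
        if not (child.nodeType == Node.TEXT_NODE and not child.data.strip())
    ]
    if not children:
        writer.write("/>" + newl)
        return

    writer.write(">")
    if len(children) == 1 and children[0].nodeType == Node.TEXT_NODE:
        # Single text child: keep it on the same line as its tags.
        writer.write(escape(children[0].data.strip()))
    else:
        writer.write(newl)
        for child in children:
            child.writexml(writer, indent + addindent, addindent, newl)
        writer.write(indent)
    writer.write("</%s>%s" % (self.tagName, newl))


# Replace the method at runtime instead of editing minidom.py on disk.
minidom.Element.writexml = fixed_writexml

pretty = minidom.parseString(SAMPLE_XML).toprettyxml(indent="    ")
print(pretty)
```

The replacement affects every minidom user in the same process, and it discards whitespace that may matter in mixed-content documents, so it is best kept to scripts where that trade-off is acceptable.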
Hey, could you please share your solution to the patch ? thanks ! – Igal Jan 23 '13 at 14:43
