79

I'm creating a web API and need a way to quickly generate well-formatted XML. I cannot find a good way of doing this in Python.

Note: Some libraries look promising but either lack documentation or only output to files.

martineau
Joshkunz

6 Answers

99

ElementTree is a good module for both reading and writing XML, e.g.:

from xml.etree.ElementTree import Element, SubElement, tostring

root = Element('root')
child = SubElement(root, "child")
child.text = "I am a child"

# encoding="unicode" returns a str; the default returns bytes in Python 3
print(tostring(root, encoding="unicode"))

Output:

<root><child>I am a child</child></root>

See this tutorial for more details and how to pretty print.
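As a minimal sketch of pretty printing with the standard library alone, assuming Python 3.9 or later (where `xml.etree.ElementTree.indent` is available), the same tree can be indented in place before serializing. The variable name `pretty` is only illustrative.

```python
import xml.etree.ElementTree as ET

# Build the same small tree as in the example above.
root = ET.Element("root")
child = ET.SubElement(root, "child")
child.text = "I am a child"

# indent() inserts whitespace into the tree in place (Python 3.9+).
ET.indent(root, space="  ")

# encoding="unicode" returns a str rather than bytes.
pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

On older Python versions `ET.indent` does not exist, so a different approach (such as lxml or minidom) would be needed.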

Alternatively, if your XML is simple, do not underestimate the power of string formatting :)

xmlTemplate = """<root>
    <person>
        <name>%(name)s</name>
        <address>%(address)s</address>
    </person>
</root>"""

data = {'name': 'anurag', 'address': 'Pune, india'}
print(xmlTemplate % data)

Output:

<root>
    <person>
        <name>anurag</name>
        <address>Pune, india</address>
    </person>
</root>

You can use string.Template or some template engine too, for complex formatting.
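Because plain string substitution does not quote special characters, here is a hedged sketch of the `string.Template` variant that escapes each value with `xml.sax.saxutils.escape` first. The sample data and the names `xml_template`, `safe`, and `xml_text` are made up for illustration.

```python
from string import Template
from xml.sax.saxutils import escape

xml_template = Template("""<root>
    <person>
        <name>$name</name>
        <address>$address</address>
    </person>
</root>""")

# Hypothetical input containing characters that are special in XML.
data = {"name": "Tom & Jerry", "address": "<unknown>"}

# Escape &, < and > in every value before substituting it into the template.
safe = {key: escape(value) for key, value in data.items()}

xml_text = xml_template.substitute(safe)
print(xml_text)
```

Note that `escape` covers element text only; values placed inside attributes would need `xml.sax.saxutils.quoteattr` instead.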

kolypto
Anurag Uniyal
  • 8
    Be careful with the second method, as it does not quote special characters, so if your data contains characters such as `<>&` you can end up with malformed xml. – zch Jul 23 '15 at 11:41
  • 1
    ...or worse, it may allow an injection attack, depending on if you put any sort of user input there. – Nick T Apr 03 '21 at 03:11
97

Using lxml:

from lxml import etree

# create XML 
root = etree.Element('root')
root.append(etree.Element('child'))
# another child with text
child = etree.Element('child')
child.text = 'some text'
root.append(child)

# pretty string (encoding='unicode' returns a str instead of bytes)
s = etree.tostring(root, pretty_print=True, encoding='unicode')
print(s)

Output:

<root>
  <child/>
  <child>some text</child>
</root>

See the tutorial for more information.

Community
ars
23

I would use the yattag library.

from yattag import Doc

doc, tag, text = Doc().tagtext()

with tag('food'):
    with tag('name'):
        text('French Breakfast')
    with tag('price', currency='USD'):
        text('6.95')
    with tag('ingredients'):
        for ingredient in ('baguettes', 'jam', 'butter', 'croissants'):
            with tag('ingredient'):
                text(ingredient)
    

print(doc.getvalue())

FYI I'm the author of the library.

John Smith Optional
  • 2
    I am not sure if I should think it's beautiful or ugly, actually. I've used the `with` statement for opening files so far and I think of it as a help to "clean up" or "close" whatever I write directly after the `with` statement. So in this case it would close the tags? Or would it throw them away like a file handle, when opening files? If it throws it away, then why is it still in the final output? Must be because of that `text()` function. But isn't that circumventing the character of the `with` statement? – Zelphir Kaltstahl Jul 27 '15 at 23:13
  • 2
    Official docs at yattag.org have a good explanation of how that works: "The tag method returns a context manager. In Python, a context manager is an object that you can use in a with statement. Context managers have __enter__ and __exit__ methods. The __enter__ method is called at the beginning of the with block and the __exit__ method is called when leaving the block. Now I think you can see why this is useful for generating xml or html. with tag('h1') creates a <h1> tag. It will be closed at the end of the with block. This way you don't have to worry about closing your tags." – Anton Matosov Sep 06 '20 at 06:01
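To make the context-manager idea from the comments concrete, here is a toy sketch using only the standard library. It is not yattag's actual implementation; the `TinyDoc` class and its methods are hypothetical and only mimic the shape of the API shown in the answer.

```python
from contextlib import contextmanager
from xml.sax.saxutils import escape, quoteattr


class TinyDoc:
    """Hypothetical toy builder; NOT how yattag is implemented."""

    def __init__(self):
        self.parts = []

    @contextmanager
    def tag(self, name, **attrs):
        attr_text = "".join(
            f" {key}={quoteattr(str(value))}" for key, value in attrs.items()
        )
        # Runs when the with block is entered: write the opening tag.
        self.parts.append(f"<{name}{attr_text}>")
        try:
            yield
        finally:
            # Runs when the with block is left: write the closing tag.
            self.parts.append(f"</{name}>")

    def text(self, value):
        # Escape &, < and > so the text cannot break the markup.
        self.parts.append(escape(str(value)))

    def getvalue(self):
        return "".join(self.parts)


doc = TinyDoc()
with doc.tag("food"):
    with doc.tag("price", currency="USD"):
        doc.text("6.95")

result = doc.getvalue()
print(result)
```

Nothing is thrown away when the block ends: entering and leaving each `with` block simply appends the opening and closing tags to the same buffer that `text()` writes to.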
17

Use the E-factory from the lxml.builder module, from: http://lxml.de/tutorial.html#the-e-factory

import lxml.builder as lb
from lxml import etree

nstext = "new story"
story = lb.E.Asset(
  lb.E.Attribute(nstext, name="Name", act="set"),
  lb.E.Relation(lb.E.Asset(idref="Scope:767"),
            name="Scope", act="set")
  )

print('story:\n' + etree.tostring(story, pretty_print=True, encoding='unicode'))

Output:

story:
<Asset>
  <Attribute name="Name" act="set">new story</Attribute>
  <Relation name="Scope" act="set">
    <Asset idref="Scope:767"/>
  </Relation>
</Asset>
Lars Nordin
  • This is pretty awesome, thanks! Wanted to use Yattag offered by John Smith Optional, but was really glad to know that my favorite lxml has the same approach. – Nikita Hismatov Nov 18 '19 at 14:53
16

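Another option, if you want to use pure Python (standard library only):

ElementTree is good for most cases, but it cannot write CDATA sections, and before Python 3.9 (which added xml.etree.ElementTree.indent) it had no built-in pretty printing.

So, if you need CDATA and pretty printing, you can use minidom: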

minidom_example.py:

from xml.dom import minidom

doc = minidom.Document()

root = doc.createElement('root')
doc.appendChild(root)

leaf = doc.createElement('leaf')
text = doc.createTextNode('Text element with attributes')
leaf.appendChild(text)
leaf.setAttribute('color', 'white')
root.appendChild(leaf)

leaf_cdata = doc.createElement('leaf_cdata')
cdata = doc.createCDATASection('<em>CData</em> can contain <strong>HTML tags</strong> without encoding')
leaf_cdata.appendChild(cdata)
root.appendChild(leaf_cdata)

branch = doc.createElement('branch')
branch.appendChild(leaf.cloneNode(True))
root.appendChild(branch)

mixed = doc.createElement('mixed')
mixed_leaf = leaf.cloneNode(True)
mixed_leaf.setAttribute('color', 'black')
mixed_leaf.setAttribute('state', 'modified')
mixed.appendChild(mixed_leaf)
mixed_text = doc.createTextNode('Do not use mixed elements if it possible.')
mixed.appendChild(mixed_text)
root.appendChild(mixed)

xml_str = doc.toprettyxml(indent="  ")
with open("minidom_example.xml", "w") as f:
    f.write(xml_str)

minidom_example.xml:

<?xml version="1.0" ?>
<root>
  <leaf color="white">Text element with attributes</leaf>
  <leaf_cdata>
<![CDATA[<em>CData</em> can contain <strong>HTML tags</strong> without encoding]]>  </leaf_cdata>
  <branch>
    <leaf color="white">Text element with attributes</leaf>
  </branch>
  <mixed>
    <leaf color="black" state="modified">Text element with attributes</leaf>
    Do not use mixed elements if it possible.
  </mixed>
</root>
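If the tree has already been built with ElementTree, one common workaround is to round-trip it through minidom just for formatting. The following is a sketch under that assumption; the element names are reused from the first answer and `pretty` is an illustrative variable name. The result is a string, so nothing has to be written to a file.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Build a small tree with ElementTree.
root = ET.Element("root")
child = ET.SubElement(root, "child")
child.text = "I am a child"

# Serialize to bytes, re-parse with minidom, then pretty-print to a str.
raw = ET.tostring(root)
pretty = minidom.parseString(raw).toprettyxml(indent="  ")
print(pretty)
```

The extra parse makes this slower than formatting the tree directly, which may matter for large documents.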
SergO
1

I've tried some of the solutions in this thread and, unfortunately, found some of them cumbersome (i.e. requiring excessive effort when doing something non-trivial) and inelegant. Consequently, I thought I'd throw my preferred solution, web2py HTML helper objects, into the mix.

First, install the standalone web2py module:

pip install web2py

Unfortunately, the above installs an extremely antiquated version of web2py, but it'll be good enough for this example. The updated source is here.

Import the web2py HTML helper objects, documented here.

from gluon.html import *

Now, you can use web2py helpers to generate XML/HTML.

words = ['this', 'is', 'my', 'item', 'list']

# helper function
def create_item(idx, word):
    return LI(word, _id='item_%s' % idx, _class='item')

# create the HTML
items = [create_item(idx, word) for idx, word in enumerate(words)]
ul = UL(items, _id='my_item_list', _class='item_list')
my_div = DIV(ul, _class='container')

>>> my_div

<gluon.html.DIV object at 0x00000000039DEAC8>

>>> my_div.xml()
# I added the line breaks for clarity
<div class="container">
   <ul class="item_list" id="my_item_list">
      <li class="item" id="item_0">this</li>
      <li class="item" id="item_1">is</li>
      <li class="item" id="item_2">my</li>
      <li class="item" id="item_3">item</li>
      <li class="item" id="item_4">list</li>
   </ul>
</div>
Boa