95

In Python, what is the most elegant way to generate HTML documents? I currently append all of the tags manually to a giant string and write that to a file. Is there a more elegant way of doing this?

shoes
  • 3
    Possible duplicate of [python html generator](http://stackoverflow.com/questions/1548474/python-html-generator). If you're generating XHTML, also consider using an XML tool. – You Jul 19 '11 at 14:17
  • 1
    There are [several templating systems](http://wiki.python.org/moin/Templating) available for Python. Is that what you are looking for? – George Cummins Jul 19 '11 at 14:17

8 Answers

78

You can use yattag to do this in an elegant way. FYI I'm the author of the library.

from yattag import Doc

doc, tag, text = Doc().tagtext()

with tag('html'):
    with tag('body'):
        with tag('p', id = 'main'):
            text('some text')
        with tag('a', href='/my-url'):
            text('some link')

result = doc.getvalue()

It reads like html, with the added benefit that you don't have to close tags.

John Smith Optional
39

I would suggest using one of the many template languages available for Python, for example the one built into Django (you don't have to use the rest of Django to use its templating engine). A Google query should give you plenty of alternative template implementations.

I find that learning a template library helps in so many ways - whenever you need to generate an e-mail, HTML page, text file or similar, you just write a template, load it with your template library, then let the template code create the finished product.

Here's some simple code to get you started:

#!/usr/bin/env python

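import django
from django.template import Template, Context
from django.conf import settings
# We have to do this to use django templates standalone - see
# http://stackoverflow.com/questions/98135/how-do-i-use-django-templates-without-the-rest-of-django
# Django 1.8+ also needs a configured template backend and django.setup().
settings.configure(TEMPLATES=[{"BACKEND": "django.template.backends.django.DjangoTemplates"}])
django.setup()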

# Our template. Could just as easily be stored in a separate file
template = """
<html>
<head>
<title>Template {{ title }}</title>
</head>
<body>
Body with {{ mystring }}.
</body>
</html>
"""

t = Template(template)
c = Context({"title": "title from code",
             "mystring":"string from code"})
print(t.render(c))

It's even simpler if you have templates on disk: check out the render_to_string function in Django 1.7, which loads a template from a predefined list of search paths, fills it with data from a dictionary, and renders it to a string, all in one function call. (Removed from Django 1.8 on; see Engine.from_string for comparable functionality.)
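
The same "write a template, fill it with data" workflow can be sketched with only the standard library. This is not Django's engine: it swaps in string.Template, which has no loops, conditionals, or automatic escaping, so the values are escaped by hand with html.escape. The names PAGE and render_page are made up for this illustration.

```python
from html import escape
from string import Template

# The template mirrors the Django example, with $name placeholders
# instead of {{ name }}.
PAGE = Template("""<html>
<head>
<title>Template $title</title>
</head>
<body>
Body with $mystring.
</body>
</html>
""")


def render_page(title, mystring):
    # string.Template does no escaping, so escape every value before
    # substituting it into the markup.
    return PAGE.substitute(title=escape(title), mystring=escape(mystring))


page = render_page("title from code", "string <from> code")
print(page)
```

For anything beyond flat substitution, a real template engine such as Django's or Jinja2 is likely the better fit.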

Carambakaracho
Erik Forsberg
  • 8
    I thought of this, but I don't think it's exactly what the OP is asking for. It sounds like they want to build up the HTML itself programmatically, whereas a template assumes you already have the HTML but just need to fill in some variables. – Daniel Roseman Jul 19 '11 at 14:38
  • 2
    It sounds more like they have the content ready, and then need to paste html around the content. This is exactly what a templating engine is for. – Wilduck Jul 19 '11 at 14:43
  • 5
    Also, if you want a templating engine like the one in Django, use Jinja2. It's faster, more powerful, and is a standalone project. http://jinja.pocoo.org/docs/ – Wilduck Jul 19 '11 at 14:44
  • I'm working on a project where I need something exactly like this. I've inserted the code into PyScripter. How can I see the HTML output? Do I save it as a .py file or .html? Do I open it in my browser? – Anon May 02 '13 at 05:59
17

If you're building HTML documents, then I highly suggest using a template system (like jinja2), as others have suggested. If you need some low-level generation of HTML bits (perhaps as input to one of your templates), then the xml.etree package is part of the Python standard library and might fit the bill nicely.

import sys
from xml.etree import ElementTree as ET

html = ET.Element('html')
body = ET.Element('body')
html.append(body)
div = ET.Element('div', attrib={'class': 'foo'})
body.append(div)
span = ET.Element('span', attrib={'class': 'bar'})
div.append(span)
span.text = "Hello World"

if sys.version_info < (3, 0, 0):
    # python 2
    ET.ElementTree(html).write(sys.stdout, encoding='utf-8',
                               method='html')
else:
    # python 3
    ET.ElementTree(html).write(sys.stdout, encoding='unicode',
                               method='html')

Prints the following:

<html><body><div class="foo"><span class="bar">Hello World</span></div></body></html>
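
When the markup is driven by data, ET.SubElement and ET.tostring keep the code shorter than creating and appending each element by hand. The following is a minimal sketch along those lines; the helper name build_list and the sample items are invented for the illustration.

```python
from xml.etree import ElementTree as ET


def build_list(items):
    # SubElement creates the child and attaches it to the parent in one call.
    ul = ET.Element('ul', attrib={'class': 'items'})
    for item in items:
        li = ET.SubElement(ul, 'li')
        li.text = item  # special characters are escaped on serialization
    return ul


# encoding='unicode' makes tostring return a str instead of bytes.
markup = ET.tostring(build_list(['apples', 'pears & plums']),
                     encoding='unicode', method='html')
print(markup)
```

The resulting string can then be passed into a template as a ready-made fragment.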
Felix Dombek
cheshirekow
  • 1
    The last line of your example fails for me with "TypeError: write() argument must be str, not bytes" unless I change it to "sys.stdout.write(ET.tostring(html).decode("utf-8"))" – Raúl Salinas-Monteagudo Mar 28 '19 at 11:12
  • 1
    @RaúlSalinas-Monteagudo: the original snippet worked for python 2 (tested on 2.7). I've updated it so that it should now also work for python 3 (tested on 3.5). – cheshirekow May 01 '19 at 22:03
12

There is also a nice, modern alternative, airium: https://pypi.org/project/airium/

from airium import Airium

a = Airium()

a('<!DOCTYPE html>')
with a.html(lang="pl"):
    with a.head():
        a.meta(charset="utf-8")
        a.title(_t="Airium example")

    with a.body():
        with a.h3(id="id23409231", klass='main_header'):
            a("Hello World.")

html = str(a) # casting to string extracts the value

print(html)

This prints the following string:

<!DOCTYPE html>
<html lang="pl">
  <head>
    <meta charset="utf-8" />
    <title>Airium example</title>
  </head>
  <body>
    <h3 id="id23409231" class="main_header">
      Hello World.
    </h3>
  </body>
</html>

The greatest advantage of airium is that it also has a reverse translator, which builds Python code out of an HTML string. If you wonder how to implement a given HTML snippet, the translator gives you the answer right away.

Its repository contains tests with example pages translated automatically with airium, in tests/documents. A good starting point (any existing tutorial) is this one: tests/documents/w3_architects_example_original.html.py

  • I've used `yattag` (from the current highest voted answer) in the past, but I like this solution better because there is no implicit shared state between objects. – SethMMorton Dec 09 '20 at 03:42
  • Oh, yes, `yattag` is the previous generation of `airium`. I strongly recommend switching to `airium` because 1. It's nicer, 2. Generation of large documents is faster in `airium` (more efficient composer), 3. `airium` has its own transpiler, `yattag` does not. – Mikaelblomkvistsson Dec 09 '20 at 14:22
8

I would recommend using xml.dom to do this.

http://docs.python.org/library/xml.dom.html

Read this manual page; it has methods for building up XML (and therefore XHTML). It makes all XML tasks far easier, including adding child nodes, document types, and attributes, and creating text nodes. This should assist you with the vast majority of what you need to do to create HTML.

It is also very useful for analysing and processing existing xml documents.

Here is a tutorial that should help you with applying the syntax:

http://www.postneo.com/projects/pyxml/
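
As a rough illustration of the node-building methods mentioned above, here is a minimal sketch using xml.dom.minidom from the standard library. The element names and text are only sample data, and the output is XML (so XHTML-style) rather than HTML.

```python
from xml.dom.minidom import getDOMImplementation

# Create an empty document whose root element is <html>.
impl = getDOMImplementation()
doc = impl.createDocument(None, "html", None)
html = doc.documentElement

# Add a child node.
body = doc.createElement("body")
html.appendChild(body)

# Add an element with an attribute and a text node.
p = doc.createElement("p")
p.setAttribute("id", "main")
p.appendChild(doc.createTextNode("some text & more"))
body.appendChild(p)

# Serialize; special characters in text nodes are escaped for us.
result = doc.toxml()
print(result)
```

doc.toprettyxml() can be used instead of doc.toxml() for indented output.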

mkrieger1
Sheik Yerbouti
  • 3
    HTML is not a subset of XML. If you're using an XML tool, you'll be generating XHTML, not HTML. – You Jul 19 '11 at 14:19
  • 1
    It's a serious lack that Python doesn't have a non-xml, html-specific (eg has methods like div(id='myid', otherattr='...'), ul() etc) version of this as standard (there are 3rd party ones). Perl and Ruby both do. – JDonner May 28 '12 at 00:00
3

I am using the code snippet known as throw_out_your_templates for some of my own projects:

https://github.com/tavisrudd/throw_out_your_templates

https://bitbucket.org/tavisrudd/throw-out-your-templates/src

Unfortunately, there is no pypi package for it and it's not part of any distribution as this is only meant as a proof-of-concept. I was also not able to find somebody who took the code and started maintaining it as an actual project. Nevertheless, I think it is worth a try even if it means that you have to ship your own copy of throw_out_your_templates.py with your code.

Similar to the suggestion to use yattag by John Smith Optional, this module does not require you to learn any templating language and also makes sure that you never forget to close tags or quote special characters. Everything stays written in Python. Here is an example of how to use it:

html(lang='en')[
  head[title['An example'], meta(charset='UTF-8')],
  body(onload='func_with_esc_args(1, "bar")')[
      div['Escaped chars: ', '< ', u'>', '&'],
      script(type='text/javascript')[
           'var lt_not_escaped = (1 < 2);',
           '\nvar escaped_cdata_close = "]]>";',
           '\nvar unescaped_ampersand = "&";'
          ],
      Comment('''
      not escaped "< & >"
      escaped: "-->"
      '''),
      div['some encoded bytes and the equivalent unicode:',
          '你好', unicode('你好', 'utf-8')],
      safe_unicode('<b>My surrounding b tags are not escaped</b>'),
      ]
  ]
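
To show how a `tag(attrs)[children]` syntax like this can work in plain Python, here is a small Python 3 sketch built on `__call__` and `__getitem__`. It is not the code from throw_out_your_templates: the Tag class below is hypothetical, handles only the simplest case, and has no support for comments, scripts, or unescaped "safe" strings.

```python
from html import escape


class Tag:
    """Hypothetical minimal element: tag(attr=...)[child, ...]."""

    def __init__(self, name, attrs=None, children=()):
        self.name = name
        self.attrs = dict(attrs or {})
        self.children = tuple(children)

    def __call__(self, **attrs):
        # tag(lang='en') returns a copy with the extra attributes set.
        return Tag(self.name, {**self.attrs, **attrs}, self.children)

    def __getitem__(self, children):
        # tag[a, b] receives a tuple; tag[a] receives the bare object.
        if not isinstance(children, tuple):
            children = (children,)
        return Tag(self.name, self.attrs, children)

    def __str__(self):
        attrs = ''.join(' {}="{}"'.format(key, escape(str(value)))
                        for key, value in self.attrs.items())
        # Nested tags render themselves; plain strings are escaped.
        inner = ''.join(str(child) if isinstance(child, Tag)
                        else escape(str(child), quote=False)
                        for child in self.children)
        return '<{0}{1}>{2}</{0}>'.format(self.name, attrs, inner)


html, body, div = Tag('html'), Tag('body'), Tag('div')
page = html(lang='en')[body[div['1 < 2 & 3 > 2']]]
result = str(page)
print(result)
```

Because every tag renders its own closing tag and escapes its string children, forgetting to close a tag or quote a special character is not possible in this style.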
josch
  • That would be quite interesting. Unfortunately it's quite antique: It's written in Python 2. I tried to port it to Python 3 but can't get it to work: It does not serialize the document wrappers like `HTML5Doc` :-( – Regis May Jan 18 '19 at 22:48
0

I wrote a simple wrapper for the lxml module (should work fine with xml as well) that makes tags for HTML/XML-esque documents.

Really, I liked the format of the answer by John Smith, but I didn't want to install yet another module to accomplish something that seemed so simple.

Example first, then the wrapper.

Example

from Tag import Tag


with Tag('html') as html:
    with Tag('body'):
        with Tag('div'):
            with Tag('span', attrib={'id': 'foo'}) as span:
                span.text = 'Hello, world!'
            with Tag('span', attrib={'id': 'bar'}) as span:
                span.text = 'This was an example!'

html.write('test_html.html')

Output:

<html><body><div><span id="foo">Hello, world!</span><span id="bar">This was an example!</span></div></body></html>

Output after some manual formatting:

<html>
    <body>
        <div>
            <span id="foo">Hello, world!</span>
            <span id="bar">This was an example!</span>
        </div>
    </body>
</html>

Wrapper

from dataclasses import dataclass, field
from lxml import etree


PARENT_TAG = None


@dataclass
class Tag:
    tag: str
    attrib: dict = field(default_factory=dict)
    parent: object = None
    _text: str = None

    @property
    def text(self):
        return self._text

    @text.setter
    def text(self, value):
        self._text = value
        self.element.text = value

    def __post_init__(self):
        self._make_element()
        self._append_to_parent()

    def write(self, filename):
        etree.ElementTree(self.element).write(filename)

    def _make_element(self):
        self.element = etree.Element(self.tag, attrib=self.attrib)

    def _append_to_parent(self):
        if self.parent is not None:
            self.parent.element.append(self.element)

    def __enter__(self):
        global PARENT_TAG
        if PARENT_TAG is not None:
            self.parent = PARENT_TAG
            self._append_to_parent()
        PARENT_TAG = self
        return self

    def __exit__(self, typ, value, traceback):
        global PARENT_TAG
        if PARENT_TAG is self:
            PARENT_TAG = self.parent

mkrieger1
Chris Collett
-1

I am attempting to make an easier solution called PyperText, with which you can do things like this:

from PyperText.html import Script
from PyperText.htmlButton import Button
# from PyperText.html{WIDGET} import WIDGET; e.g. from PyperText.htmlEntry import Entry; variations shared in file
myScript = Script("myfile.html")
myButton = Button()
myButton.setText("This is a button")
myScript.addWidget(myButton)
myScript.createAndWrite()