5

I am trying to use Python's ElementTree to generate an XHTML file.

However, ElementTree.Element() only lets me create a single tag (e.g., HTML). I need to create some sort of virtual root, or whatever it is called, so that I can add a DOCTYPE and the like.

How do I do that? Thanks

Gareth Simpson
Uri

2 Answers

7

I don't know if there's a better way but I've seen this done:

Create the base document as a string:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html></html>

Then parse that string to start your new document.
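A minimal sketch of that approach with the standard-library xml.etree.ElementTree. The parser accepts the DOCTYPE, but the resulting tree does not keep it, so this sketch assumes you prepend the declaration again when serializing. The names DOCTYPE, base, and document, and the sample body/p content, are illustrative only.

```python
import xml.etree.ElementTree as ET

# The DOCTYPE from the answer, kept separately so it can be re-used on output.
DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
)

# Base document as a string: DOCTYPE followed by an empty root element.
base = DOCTYPE + "\n<html></html>"

# Parsing succeeds, but only the <html> element ends up in the tree.
root = ET.fromstring(base)

# Build the rest of the document on the parsed root (sample content).
body = ET.SubElement(root, "body")
paragraph = ET.SubElement(body, "p")
paragraph.text = "Hello"

# Serialize the tree and put the DOCTYPE back in front by hand.
markup = ET.tostring(root, encoding="unicode")
document = DOCTYPE + "\n" + markup
```

Printing document should show the DOCTYPE line followed by the serialized html element.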

Gareth Simpson
  • +1, confirmed, ElementTree cannot add/create a doctype (but can parse it!), so this solution is as clean as it gets. – Alex Martelli Jul 01 '09 at 19:23
  • It appears that this no longer works: `import xml.etree.ElementTree as ET; print(ET.tostring(ET.fromstring(string), encoding='unicode'))`, with `string` holding a DOCTYPE followed by the root element, emits only the root element, without the DOCTYPE. – Matteo Gamboz Dec 07 '20 at 18:04
1

I have had the same problem. When parsing a document and writing the document back out again, the doctype declaration is no longer present. I found a solution while browsing the documentation:

Saving HTML Files

To save a plain HTML file, just write out the tree.

tree.write("outfile.htm")

This works well, as long as the file doesn’t contain any embedded SCRIPT or STYLE tags.

If you want, you can add a DTD reference to the beginning of the file:

file = open("outfile.htm", "wb")
file.write(DTD + "\n")
tree.write(file)
file.close()
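
In Python 3, a file opened with "wb" expects bytes, so the DTD prefix has to be encoded before it is written. A minimal sketch under that assumption: the helper name write_with_doctype and the constant XHTML_DOCTYPE are hypothetical, not part of ElementTree, and UTF-8 is an assumed encoding.

```python
import xml.etree.ElementTree as ET

# Assumed DOCTYPE; substitute whichever declaration the document needs.
XHTML_DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
)


def write_with_doctype(tree, path, doctype=XHTML_DOCTYPE):
    """Write the doctype on its own line, then the serialized tree, as UTF-8."""
    with open(path, "wb") as file:
        # Binary mode needs bytes, so encode the prefix string explicitly.
        file.write((doctype + "\n").encode("utf-8"))
        # ElementTree.write accepts an open binary file object.
        tree.write(file, encoding="utf-8")
```

For example, write_with_doctype(ET.ElementTree(root), "outfile.htm") would write the DOCTYPE line followed by the tree rooted at root.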
Stephan
  • Using Python 3.6, this does not work out-of-the-box but can be easily fixed. One must open the file with `"wb"` instead of `"w"` and use `.encode()` on the prefix string as `file.write` expects a binary parameter with `"wb"`. – Zoltan Jan 15 '21 at 14:25