
There is a draft specification that allows defining and using custom HTML elements.

Since this draft does not mention XHTML5 polyglot documents, and since, to my knowledge, valid (as opposed to merely well-formed) XML documents require a DTD declaring all possible elements, does this mean that it is impossible to include custom elements in an HTML5 document that will also validate as XML?

lxgr

1 Answer


Given that XML validation happens against a DTD or a schema, while HTML5 allows user-defined elements and data-* attributes (and is itself a living standard, depending on whom you ask), your suspicion is most likely correct: the two are incompatible. Granted, one could write a DTD or schema tailored to a particular document, accounting for all of its custom elements and attributes, and the document would then validate in the strictest sense of the term, but that's not really how validation is meant to be used.

The good news is that, in polyglot markup, this won't be an issue. Section 3.1 of the polyglot markup specification says:

Polyglot markup results in:

  • a valid HTML document. [HTML5]
  • a well-formed XML document. [XML10]
  • identical DOMs when processed as HTML and when processed as XML, with some notable exceptions: HTML and XML parsers generate different DOMs for some xml (xml:lang, xml:space, and xml:base), xmlns (xmlns="" and xmlns:xlink=""), and xlink (such as xlink:href) attributes. XML requires and HTML5 permits these attributes in certain locations and the attributes are preserved by HTML parsers. The exception must not break the requirement to be a valid HTML document.

Polyglot Markup specifies a Robust Syntax, by which it is meant a syntax that maximizes support and minimizes authoring choice.

However:

Polyglot markup is not constrained:

  • to be valid XML. [XML10]
  • by conformance to any XML DTD.

This means that polyglot markup conforms to HTML5 by circumstance, but does not need to conform to any XML DTD in order to work. It is simply a serialization of HTML, and not an XML document type in and of itself. The concept of XML validation is in fact completely irrelevant to polyglot markup, just as XML validation is irrelevant to any XML document that doesn't declare conformance to any particular schema.
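To make the well-formed versus valid distinction concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`, which is a non-validating parser: it checks well-formedness only and never consults a DTD. The document and the `x-widget` element are made up for illustration, and `is_well_formed` is a hypothetical helper, not part of any specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical polyglot-style document containing a made-up custom element.
POLYGLOT = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head><meta charset="utf-8"/><title>Demo</title></head>
<body><x-widget data-kind="demo">Hello</x-widget></body>
</html>"""

XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def is_well_formed(text):
    """Return True if text parses as XML; no DTD or schema is involved."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The custom element is simply another element in the XHTML namespace.
root = ET.fromstring(POLYGLOT)
widget = root.find(f".//{XHTML_NS}x-widget")
```

The parser accepts `x-widget` without complaint because nothing here declares which elements are allowed. Checking validity would require a validating parser plus a DTD or schema that declares the custom element, which is exactly the part polyglot markup does not ask for.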

BoltClock
  • It is not *necessary* for a polyglot document to be valid XML, but is it *possible* (with custom elements)? – lxgr Jul 12 '15 at 14:02
  • @lxgr: Updated my answer. I grok your question, but it doesn't seem that I can offer much beyond "your assertion is correct". Maybe someone else will be able to provide a more satisfactory answer. – BoltClock Jul 12 '15 at 14:26