32

I don't understand the HTML5 specifications for the lang and xml:lang attributes of the opening <html> tag. Scrolling up a bit, I understand that xmlns is a "talisman" (has no effect), but what about lang and xml:lang? Should they be used? If so, what should they be set to?

pnuts
ma11hew28
  • I would have thought it's pretty clear from that document; "Authors must not use the lang attribute in the XML namespace on HTML elements in HTML documents", "Note: The attribute in no namespace with no prefix and with the literal localname "xml:lang" has no effect on language processing.", etc. (i.e. use `lang` but not `xml:lang` when dealing in HTML rather than XML) – Chris Morgan Dec 01 '10 at 00:35
  • Yeah, I think I finally understand that. I was having trouble cause I don't really know what a namespace or prefix is in that context. I'm assuming `` has neither, and thus, `xml:lang` has no effect. – ma11hew28 Dec 01 '10 at 00:51
  • @Chris Morgan - I don't think the document is clear at all. Thorough and precise, sure, but there's quite a lot of subtle stuff going on. Consider "Authors must not use the lang attribute in the XML namespace on HTML elements in HTML documents". It's impossible to actually do this with an HTML parser; it can only be done through scripting by using things like Document.createAttributeNS. Was that clear to you? – Alohci Dec 01 '10 at 01:20
  • @Alohci Reading HTML specs makes me cry. Just hearing about somebody else trying to interpret the specs makes me tear up in sympathy. In conclusion, I really appreciate plain English explanations like this one. – peteorpeter May 31 '12 at 15:26

3 Answers

36

Everything I've seen and heard suggests that you should stick to

<!DOCTYPE html>
<html>
  <head>
    <meta charset='UTF-8'>

(or whatever character set you actually want). If you want a language associated with the page you can use the "lang" attribute on the <html> tag.

Since HTML5 is not XML, really, I personally would find it weird to use any xml: namespace stuff.
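
As a minimal sketch of that advice, assuming the page is written in English (the `en` value, the title, and the body text are placeholders, not part of the original answer):

```html
<!DOCTYPE html>
<!-- lang takes a BCP 47 language tag; "en" is an assumed example value -->
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <title>Example page</title>
  </head>
  <body>
    <p>Hello, world.</p>
  </body>
</html>
```

Neither `xmlns` nor `xml:lang` is needed when the document is only ever served as text/html.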

Pointy
  • Cool, thanks. Then, I'll go with , like [LinkedIn](http://www.linkedin.com/) does. – ma11hew28 Dec 01 '10 at 00:58
  • Oh, too bad that if you don't specify the xmlns attribute on html, it's not valid XHTML, and if you're aiming for polyglot markup that will render correctly when served as either HTML or XHTML, you need the xmlns attribute (because serving as application/xhtml+xml will otherwise display a document tree instead of a page). On the other hand, if you use the xmlns attribute in HTML5, the W3C validator will throw an error saying that the http-equiv in a is an invalid value, even though if you leave out the meta tag, it warns that you ought to add it to the document. – Triynko Aug 30 '11 at 15:39
  • Also, even though HTML5 is not XML, it supports XHTML-like syntax on void elements like ``, and it ends up putting everything in the XHTML namespace `http://www.w3.org/1999/xhtml` anyway. – Triynko Aug 30 '11 at 15:49
  • @Triynko Is there an updated namespace that includes the tags: `header`, `footer` and the like? – dlamblin Jan 15 '13 at 22:15
  • Shouldn't the `` tag be self-closing? Like `` or does that not matter for some reason? Also, HTML is just a superset of XML afaik. While it's not XML, it's based on XML. – crush Mar 15 '13 at 12:41
  • @crush there's no such thing as self-closing tags in HTML 5. The syntax is allowed but it's not semantically meaningful. – Pointy Mar 15 '13 at 13:47
  • @crush [see this related question](http://stackoverflow.com/questions/3558119/are-self-closing-tags-valid-in-html5) – Pointy Mar 15 '13 at 13:49
  • XHTML5 _is_ valid XML. And since there are uncounted XHTML parsers/processors that simply don't know what (X)HTML5 specifically is, it's _not_ wise to omit xml:lang (and xmlns) in polyglot syntax. – jh72de Jul 01 '13 at 10:34
17

xml:lang in the text/html serialization is just there to allow authors to write polyglot documents - documents that are valid XHTML5 and valid HTML5.

In HTML (as opposed to XHTML), xml:lang is not an attribute in the XML namespace at all; it's an attribute in the null namespace whose name happens to be xml:lang. That is, the colon has no magic properties; it's just another character in the attribute name.
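
For illustration, a polyglot opening might look like the sketch below. It assumes the page language is English and that the same file may be served as either text/html or application/xhtml+xml; the title and body text are placeholders. Note that the HTML specification only permits `xml:lang` in HTML documents when `lang` is also present with the same value.

```html
<!DOCTYPE html>
<!-- xmlns and xml:lang matter to an XML parser; an HTML parser treats
     xml:lang as an ordinary attribute name and takes the language from lang -->
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <meta charset="UTF-8"/>
    <title>Example page</title>
  </head>
  <body>
    <p>Hello, world.</p>
  </body>
</html>
```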


To answer the question you originally had about en-US-x-hixie:

en-US-x-hixie is en-US (i.e. American English) plus a private use subtag -x-hixie meaning the variant of US English as written by Ian Hickson, the editor of HTML5.

Private use subtags are defined in RFC 5646 (BCP 47), Section 2.2.7, "Private Use Subtags": http://www.ietf.org/rfc/bcp/bcp47.txt
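
As a rough illustration of how such a tag breaks down when used in markup (the tag is the one discussed above; the sentence text is a placeholder):

```html
<!-- en    = language subtag (English)
     US    = region subtag (United States)
     x     = singleton that introduces private use subtags
     hixie = private use subtag, meaningful only by private agreement -->
<p lang="en-US-x-hixie">Text in Hixie's variant of US English.</p>
```

Prefix-based matching, such as the CSS selector `:lang(en-US)`, still matches this element.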

Alohci
0

The lang attribute makes a huge difference in an HTML document for users who rely on a screen reader, so for accessibility (a11y) you definitely want to use it. This video is the best argument for it: https://youtu.be/0uzxu9dQnuU "Effect of lang attribute on JAWS speech". It shows how a screen reader pronounces English text with Spanish, French, or German pronunciation (which is very hard to understand) simply because the lang attribute is set to each of those languages in turn.

Also check https://www.w3.org/International/questions/qa-lang-why.en, where some of the good reasons mentioned are:

  • Styling (for example different fonts for different languages)
  • Spelling and grammar checkers
  • Translation tools
  • Search results (language markup within the page can be used to improve their quality based on the user's linguistic preferences)
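
The styling point can be sketched with the CSS `:lang()` pseudo-class. The font name and the sample sentences below are assumptions chosen for illustration:

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <title>Language-specific styling</title>
    <style>
      /* Applies to any element whose language is Japanese,
         whether set on the element itself or inherited */
      :lang(ja) { font-family: "Noto Sans JP", sans-serif; }
      /* Language-appropriate quotation marks for French */
      :lang(fr) { quotes: "«" "»"; }
    </style>
  </head>
  <body>
    <p>This paragraph inherits English from the html element.</p>
    <p lang="ja">こんにちは</p>
    <p lang="fr">Elle a dit <q>bonjour</q>.</p>
  </body>
</html>
```

The same per-element `lang` values are what a screen reader can use to switch pronunciation partway through a page.
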
Nasia Makrygianni