
I'm switching from XHTML 1.0 Strict to XHTML5, but I'm having issues with the default DOCTYPE declarations -- the documents are no longer well-formed XML, and cannot be loaded in some browsers when served as application/xhtml+xml with the .xhtml extension, mostly because of named entities like `&mdash;` etc.

I've tried just putting back the XHTML 1.0 Strict DOCTYPE and stuff, and it all works in the browsers as expected (no XML parsing errors, and the new article elements still work all right), but the W3C validator no longer accepts it as valid HTML5, due to the meta charset specification from HTML5, for example.

How do I automatically import the entities so that the browser's XML parser accepts the document, but still specify HTML5 for the W3C validator?

John
cnst
  • I'm not sure you can. The HTML5 spec contains [rules for user-agents](http://w3c.github.io/html/xhtml.html#parsing-xhtml-documents) (browsers) about how to resolve named character references based on the doctype public identifier, but it doesn't appear to apply to conformance checkers. Hopefully, @sideshowbarker (the validator maintainer) might happen along and be able to provide more insight. My view is that if you want to use and validate XHTML, give up on named character entities and just use the native characters, or numeric character references. – Alohci Apr 11 '16 at 23:57
  • So, I've tried playing with the browsers, and it appears that some older Mozilla releases, for example, specifically expect "XHTML 1.0 Strict" in the doctype, otherwise, the errors appear. So, it sounds like it's the forward compatibility that's missing -- it should be possible to use the XHTML 1.0 Strict doctype for the legacy browsers, all the while having an extra one for the checkers to detect XHTML5. – cnst Apr 12 '16 at 03:27
  • Possible duplicate of [How do I define HTML entity references inside a valid XML document?](http://stackoverflow.com/questions/6508860/how-do-i-define-html-entity-references-inside-a-valid-xml-document) – Mr Lister Apr 12 '16 at 06:33
  • This is a known problem with XHTML5. I think it's because the W3C believes these entities are not necessary any more: you can simply type in any character you want. When this system was designed, 20 years ago, that wasn't nearly as easy as it is today. So that may be their reasoning: use — instead of `&mdash;`. It's still a real problem though. – Mr Lister Apr 12 '16 at 06:38
  • By the way, where you're using the XHTML doctype anyway, may I recommend using the XHTML 1.1 one instead of XHTML 1.0. XHTML 1.1 looks more like HTML5 to the validator, having new elements like and stuff. – Mr Lister Apr 12 '16 at 06:51
  • @MrLister, not a duplicate -- I'm getting the entity errors from within a browser after removing the XHTML 1.0 Strict doctype, not from an external parser like is the case with http://stackoverflow.com/questions/6508860/how-do-i-define-html-entity-references-inside-a-valid-xml-document that you mention. – cnst Apr 12 '16 at 17:27
  • Hm, yes, maybe [this](http://stackoverflow.com/questions/3215053/xhtml5-and-html4-character-entities) is a better duplicate. Or [this](http://stackoverflow.com/questions/24563300/parse-xhtml5-with-undefined-entities). Not [this](http://stackoverflow.com/questions/4053917/where-is-the-html5-document-type-definition), although that one's a good read. – Mr Lister Apr 12 '16 at 17:30

1 Answer


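Use numeric character references, such as `&#38;` for an ampersand or `&#8212;` for an em dash, which work in both the HTML and XML parsers, instead of named entities such as `&mdash;`, which an XML parser does not recognize unless a DTD declares them. (The five entities predefined by XML itself, `&amp;`, `&lt;`, `&gt;`, `&quot;` and `&apos;`, are the exception.) Usually MDN (Mozilla Developer Network) has reliable information, and I loathe W3Schools, so I'd point to December, which has a fairly exhaustive list; I'd also highly recommend the Unicode Character Table website.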
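If you have a lot of existing markup, the replacement can be scripted. The following is only a minimal sketch in Python, not something taken from my own platform: the function names `named_to_numeric` and `is_well_formed` are made up for illustration, it only knows the HTML 4 / XHTML 1.0 entity names from the standard library's `html.entities.name2codepoint`, and it uses a plain regular expression, so it assumes you have no entity-like text inside comments, CDATA sections or scripts that must stay untouched.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entities XML predefines; an XML parser accepts these without a DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches named references such as &mdash; (numeric ones start with '#').
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(markup: str) -> str:
    """Rewrite known named entities as decimal numeric character references."""
    def repl(match: re.Match) -> str:
        name = match.group(1)
        # Leave XML's own entities and unknown names exactly as they are.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return f"&#{name2codepoint[name]};"

    return _ENTITY.sub(repl, markup)


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML without a DTD."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    before = "<p>XHTML 1.0 &mdash; XHTML5 &amp; friends</p>"
    after = named_to_numeric(before)
    print(after)
    print(is_well_formed(before), is_well_formed(after))
```

The idea is that the output no longer depends on which DOCTYPE the browser or the validator sees, since numeric references need no entity declarations at all.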

My platform and website (link in my profile) is served as XHTML5 and uses UTF-8 characters in place of images (fewer HTTP requests for better performance).

As far as doctypes are concerned, you didn't specifically mention which versions of which browsers, so you'd need to comment in order for me to look into it. I have installers going all the way back to Opera 2 and Mozilla Suite 0.8. :-)

John