Possible Duplicate:
Do I really need to encode '&' as '&amp;'?

I know that the W3C recommends that authors 'use "&amp;" (ASCII decimal 38) instead of "&" to avoid confusion with the beginning of a character reference', and that tidy warns of an "unescaped & which should be written as &amp;". These warnings notwithstanding, is it valid to write (say) "Tiffany & Co." instead of the more cautious "Tiffany &amp; Co."?

I know the example above renders as expected (i.e., the same) in all browsers I've tried, but this is a syntax question for the language lawyers out there. It seems to me that, as long as the expression is not of the form "&blah;", it should be legitimate. (The same goes for angle brackets, as long as you don't write "<blah>".)

As an aside, the fact that skipping </li>'s is not deemed worth a warning, while unescaped &'s are, speaks volumes about the inconsistency of HTML syntax. In fact, if we're going to rely on the parser's smarts, leaving out a </ul> (for instance) should be OK if the (unfinished) list is the very last part of the document: no </ul>, no </body>, no </html>... no possible ambiguity, right?

PS: I just found out that (sensible) unescaped angle brackets yield no warnings in tidy, but are detected as errors by W3C's validator, while unescaped ampersands give warnings in both places. Go figure...
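
To make the "confusion with the beginning of a character reference" concrete, here is a minimal sketch using Python's standard `html` module. It is not a validator and says nothing about what the W3C checker or tidy report; it only shows how one decoder treats a bare ampersand. The sample strings are made up for illustration.

```python
import html

raw = "Tiffany & Co."

# Escaping always produces the cautious form.
escaped = html.escape(raw)
print(escaped)  # Tiffany &amp; Co.

# Decoding the escaped form gives back the original text.
print(html.unescape(escaped) == raw)  # True

# A bare "&" followed by a space is not read as a reference,
# so the unescaped form survives decoding unchanged.
print(html.unescape(raw) == raw)  # True

# But a bare "&" followed by something that looks like an entity name
# is decoded, even without the trailing semicolon.
ambiguous = "fish &copy chips"
print(html.unescape(ambiguous))  # fish © chips
```

The last case is the one the recommendation guards against: whether a bare ampersand is harmless depends on what happens to follow it.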

ezequiel-garzon
  • In SGML and HTML5, a free ampersand is allowable when followed by a character that isn't valid in a NAME token. So `a & b` is OK and you'll only get a warning, whereas `a&b` is a proper error. Neither is allowable in XHTML and it's certainly a bad idea ever to rely on this. – bobince Sep 25 '10 at 13:43
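
The distinction in the comment above (`a & b` versus `a&b`) can be roughly approximated with a regular expression. This is only a sketch: the function name `risky_ampersands` and both patterns are hypothetical, and they simplify the real SGML/HTML5 tokenisation rules (for instance, they ignore legacy entities that are recognised without a semicolon).

```python
import re

# Assumption: a well-formed reference looks like &name; or &#38; or &#x26;
WELL_FORMED = re.compile(r"&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);")

# An ampersand followed by a letter or '#' could be the start of a reference.
NAME_LIKE = re.compile(r"&(?=[A-Za-z#])")


def risky_ampersands(text):
    """Return offsets of ampersands that look like the start of a
    reference but are not a complete, well-formed one."""
    return [
        m.start()
        for m in NAME_LIKE.finditer(text)
        if not WELL_FORMED.match(text, m.start())
    ]


print(risky_ampersands("a & b"))              # []
print(risky_ampersands("a&b"))                # [1]
print(risky_ampersands("Tiffany &amp; Co."))  # []
```

A check like this only flags the cases most likely to be misread; escaping every ampersand remains the simpler rule.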

1 Answer

Invalid? Yes. A free ampersand inside HTML markup will result in a parser error.

Can modern browsers ignore the error and parse correctly? Most likely.

But it is malformed HTML.

Yuval Adam
  • Thanks for your reply. W3C's validator only gives a warning, though an error is issued for free angle brackets. It won't ruin my day, but all of this is highly inconsistent, wouldn't you say? Also, the word "should" instead of "must" in standards is sometimes considered an option. By the way, even good ol' lynx gets these (unambiguous) examples right. – ezequiel-garzon Sep 25 '10 at 12:39
  • You're dealing with HTML. Browsers practically invented the word "inconsistent" when it comes to dealing with HTML. – Yuval Adam Sep 25 '10 at 13:16