17

So which one should I start with, HTML or XHTML? I am a beginner and want a solid foundation in markup languages, but as I started learning I found that some people use HTML and some use XHTML.

Jasmine Appelblad
  • Well, as a beginner I don't have any clue of my own which one to choose between HTML and XHTML, and the majority of developers voted for XHTML, so I'll go with XHTML. – Jasmine Appelblad Jan 01 '10 at 20:34
  • 12
    If you're a beginner you should use nothing other than HTML 4. XHTML is a fad that resulted from the XML fervour of 5-10 years ago. XHTML 2 died in fact. XHTML 5 is nothing more than a token gesture to placate those using XML for HTML and not recommended for use by the W3C. Despite the answer you've chosen there is very little reason to use it. – cletus Jan 01 '10 at 20:45
  • @cletus: HTML5 is a draft. Neither the HTML nor the XHTML serializations of HTML5 are recommended for use by the W3C. – Alohci Jan 02 '10 at 01:32
  • @Alohci: It is not recommended to use complete HTML5, but you can quite safely add some aspects of it into your code – Casebash Jul 07 '10 at 10:19
  • @Casebash. Indeed. In both HTML and XHTML serializations. – Alohci Jul 07 '10 at 12:06

8 Answers

24

Conventional wisdom has come sort of full circle on this point. Back in like 2002 everyone was gung-ho for XHTML but many people (including myself) didn't have good reasons why. It was just the cool new thing and everyone jumped on the bandwagon, started putting XHTML in their resume skills instead of just HTML which looked so plain and unimpressive.

What's happening now is, with HTML5 on the way, people are starting to realize that there's nothing wrong with good old-fashioned HTML. It's the language of the web. Here are the pros and cons of XHTML as I see them:

Pro

  • Allows you to embed non-xhtml XML into your web page, such as an SVG element. This isn't possible with plain HTML.
  • Allows you to easily parse your documents with an XML parser, which could obviate the need for hpricot or BeautifulSoup if say, you wanted to replace all H1 tags with H2 tags in your website templates.
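To make the second point concrete, here is a minimal sketch of the H1-to-H2 replacement using only Python's standard library XML parser. The function name `demote_headings` and the `SAMPLE` document are made up for illustration, and the sketch assumes the template is well-formed XHTML in the XHTML namespace with no HTML-only named entities such as `&nbsp;` (a plain XML parser would reject those unless the DTD is loaded).

```python
import xml.etree.ElementTree as ET

# Namespace that XHTML elements live in.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Serialize XHTML elements with a default namespace instead of an "ns0:" prefix.
ET.register_namespace("", XHTML_NS)


def demote_headings(xhtml_text: str) -> str:
    """Return the document with every h1 element renamed to h2."""
    root = ET.fromstring(xhtml_text)
    for element in root.iter(f"{{{XHTML_NS}}}h1"):
        # Renaming the tag keeps attributes, text, and children untouched.
        element.tag = f"{{{XHTML_NS}}}h2"
    return ET.tostring(root, encoding="unicode")


# Hypothetical template used only for this illustration.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Page title</title></head>
<body><h1>Heading</h1><p>Paragraph<br /></p></body>
</html>"""

if __name__ == "__main__":
    print(demote_headings(SAMPLE))
```

With plain HTML you would need a forgiving HTML parser for the same job, which is the trade-off this bullet describes.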

Con

  • IE doesn't understand the 'application/xhtml+xml' mime type, so as far as it's concerned you're sending malformed HTML.
  • It's a little more verbose. <br> and <table cellspacing=0 cellpadding=0> are neater looking, in my opinion, than <br /> and <table cellspacing="0" cellpadding="0">.

There must be some advantages to XHTML that I'm missing, but I myself use HTML for everything these days.

jpsimons
  • I don't know about that. SGML requires quotes, but I always thought HTML didn't. http://dev.w3.org/html5/markup/syntax.html#syntax-attributes – jpsimons Jan 01 '10 at 20:57
  • 1
    Yes. cletus is wrong. In general, using attributes without quotes is valid HTML. As it happens, I think quoting attributes is neater, but it's very much a personal preference and other reasonable people may disagree. – Alohci Jan 02 '10 at 01:23
  • @Alohci: Doesn't strict HTML require quotes which means it is more a matter of future compatibility than personal preference? – Casebash Jul 07 '10 at 10:24
  • @Casebash - No. HTML 4 didn't require it, and neither does HTML 5 in the text/html syntax. Easy to check. Copy test

    Test

    into http://validator.w3.org/#validate_by_input and run. Even if a future HTML did make it a conformance requirement, HTML parsers (e.g. in browsers) would still need to support omitted quotes to be able to process huge swathes of the web. There is no future compatibility issue. XHTML served correctly as application/xhtml+xml does require them, of course.
    – Alohci Jul 07 '10 at 12:01
15

XHTML is only useful if you want to autogenerate, manage, or validate the HTML code with the help of an XML-based tool, such as a component-based MVC framework (e.g. Sun JSF, Apache Struts, Microsoft ASP.NET, etc.) or XSLT. Parsing and formatting HTML programmatically is trickier than XML, because HTML allows some tags to be left unclosed, e.g. <br>. XML is much easier to parse and format programmatically because it is required to be well-formed.
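As a rough illustration of the well-formedness point, the sketch below feeds the same fragment to Python's strict XML parser and to its lenient HTML parser. The helper names `is_well_formed_xml`, `html_start_tags`, and `TagCollector` are invented for this example; only the standard library is used.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records the name of every start tag the HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def is_well_formed_xml(markup: str) -> bool:
    """True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def html_start_tags(markup: str) -> list:
    """Start tags seen by the lenient HTML parser, in document order."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    for fragment in ("<p>First line<br>Second line</p>",
                     "<p>First line<br />Second line</p>"):
        print(fragment, is_well_formed_xml(fragment), html_start_tags(fragment))
```

The unclosed <br> is ordinary HTML but makes the XML parser give up, whereas the XHTML-style <br /> is accepted by both, which is why XML-based tooling wants XHTML as input.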

If you're just starting and/or hand-writing "plain vanilla" HTML, I would recommend using HTML 4.01 elements with an HTML5 doctype. There's really no need to massage the HTML code into an XML format.

<!DOCTYPE html>
<html lang="en">
    <head>
        <title>Page title</title>
    </head>
    <body>
        <h1>Heading</h1>
        <p>Paragraph</p>
    </body>
</html>

The HTML5 elements aren't widely supported yet, hence the recommendation to stick with HTML 4.01 elements. The HTML5 doctype triggers standards mode in most browsers, including IE6. The other benefit of HTML5 is that it allows XHTML-style self-closing tags on void elements. Also see HTML5 spec chapter 3.2.2:

Authors may optionally choose to use this same syntax for void elements in the HTML syntax as well. Some authors also choose to include whitespace before the slash, however this is not necessary. (Using whitespace in that fashion is a convention inherited from the compatibility guidelines in XHTML 1.0, Appendix C.)

Basically, even if you write pure XHTML, using <!DOCTYPE html> would still make it valid (and put web browsers into the correct standards mode).

BalusC
12

XHTML is pretty much like HTML but non-sloppy. I really can't think of a reason besides laziness not to use it.

David Hedlund
  • 16
    Someone does not agree: http://hixie.ch/advocacy/xhtml – ntd Jan 01 '10 at 20:24
  • 2
    Most if not all highly seasoned developers would not agree with you, novice types, well yes they'd surely agree. – DoctorLouie Jan 01 '10 at 20:26
  • 1
    Here's a reason why you shouldn't really use XHTML: http://stackoverflow.com/questions/1770193/ies-xhtml-compatibility – DoctorLouie Jan 01 '10 at 20:31
  • 1
    @ntd, he just argues that sending xhtml as text/html may be a bad thing, it doesn't actually say -using- xhtml is bad. – Daniel Sloof Jan 01 '10 at 20:33
  • 14
    @Daniel: actually that's exactly what he says. He explains that XHTML as text/html is bad and IE6 doesn't support application/xhtml+xml. You lose several conveniences with XHTML (like HTML entities; XHTML entities are different). There is very little reason to use XHTML. So much so that XHTML 2 died and XHTML 5 exists only for compatibility reasons for those already using XHTML. – cletus Jan 01 '10 at 20:41
  • 8
    HTML is a well-defined language just like XHTML. Nothing sloppy about it. Either one can be implemented with mistakes and become invalid "tag soup." – jpsimons Jan 01 '10 at 21:19
  • @cletus not just IE6. No version of ie supports it. – Alex Jasmin Jan 01 '10 at 22:07
  • HTML5. There's a reason right there. – roryf Jul 07 '10 at 12:31
  • 1
    Redundancy is not a proof of lack of laziness. – Danubian Sailor May 23 '12 at 07:55
3

When it comes to learning one or the other, there's really rather little between them. XHTML is essentially a subset of HTML that encourages (or rather requires) stricter standards -- specifically, it's an application of the XML standard to HTML. As such, any valid XHTML is also valid HTML (for the most part at least).

In my opinion, the distinction between XHTML and HTML isn't really that important. What is important, however, is to write consistent and efficient markup, and this is what the XHTML standard was designed to encourage. It doesn't matter whether you label your code as XHTML or HTML, just as long as it's well-written.

The main feature of XHTML is simply that it requires a high standard of quality in your code, but this is something you should be doing anyway in HTML.

Will Vousden
  • 1
    "As such, any valid XHTML is also valid HTML". No it isn't. Prior to HTML5, it is impossible to create a single document that is both valid XHTML and valid HTML. XHTML must include a namespace declaration attribute, and that would be an invalid attribute in HTML. – Alohci Jan 02 '10 at 01:56
  • 1
    Hence "for the most part at least". For practical purposes (and certainly for a beginner), it's still reasonable to say that XHTML is a subset of HTML. – Will Vousden Jan 02 '10 at 02:44
  • 2
    Sorry, I still don't agree. A construct like
    will do one thing in XHTML and something different when treated as HTML. To describe XHTML as a subset of HTML is to deny that, and thereby cause confusion in beginners. I prefer to describe HTML and XHTML as sibling languages, with a common vocabulary but different syntaxes.
    – Alohci Jan 02 '10 at 03:24
3

XHTML is for a-type people who think XML looks more "neat" than plain-ol HTML.

But really, it doesn't matter that much. You can switch from using one to another faster than it would take you to get some lunch.

Brian Ortiz
1

HTML 4.01 would be your best bet, since learning in stages would allow you to see a clearer picture of what's really happening behind the scenes and deep within the markup. Once you have a clear view and a thorough understanding of HTML 4.01, you can then move to XHTML 1.0.

DoctorLouie
  • 3
    So they should learn the poor habits that HTML allows then move to the better standard? – ChaosPandion Jan 01 '10 at 20:23
  • 1
    Think of it any way you'd like; the reality of the matter is that if you don't know where we are coming from, you won't be well suited to know where we are going. – DoctorLouie Jan 01 '10 at 20:28
  • 1
    What "poor habits" does HTML allow? – jpsimons Jan 01 '10 at 20:44
  • 4
    HTML 4 doesn't teach poor habits. It teaches the markup that has the widest possible support. XHTML is a self-indulgent and pointless distraction 99% of the time. – cletus Jan 01 '10 at 20:46
  • @mastermind: can you explain why HTML 4.01 is better for understanding what's happening behind the scenes? I think most people would picture the tree-structured DOM better as XML than as HTML where some tags are often omitted. To a beginner, the determination of which tags can be omitted seems very arbitrary. – Alohci Jan 02 '10 at 01:49
  • The most important thing is to use validation, even as a beginner – Casebash Jul 07 '10 at 10:28
1

Start with HTML, but use a validator. In HTML5, everyone seems to be focusing on the HTML, rather than the XHTML serialisation.

  • As I explain in my answer here, the designers of XML wanted to enforce higher coding standards and make parsing easier, but that only works if almost everybody switches. Instead of relying on your browser to enforce code quality, rely on validation.
  • Due to limited XHTML support in Internet Explorer <=8, pretty much everybody serves XHTML as text/html. This effectively restricts you to a subset of HTML and XHTML and requires you to follow compatibility guidelines. You could choose what format to serve based on user agent instead, but this is messy.
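To give an idea of what that dual serving involves, here is a minimal sketch that picks a Content-Type from the request's Accept header instead of sniffing the User-Agent string. That choice is an assumption on my part, and `choose_content_type` is a hypothetical helper, not part of any framework. It only checks whether the client explicitly lists application/xhtml+xml with a non-zero q value and deliberately ignores wildcards such as */*.

```python
def choose_content_type(accept_header: str) -> str:
    """Pick the MIME type to serve an XHTML document with.

    Returns "application/xhtml+xml" only when the client lists it
    explicitly with a non-zero q value; otherwise falls back to "text/html".
    """
    for item in accept_header.split(","):
        media_range, _, params = item.strip().partition(";")
        if media_range.strip().lower() != "application/xhtml+xml":
            continue
        quality = 1.0  # q defaults to 1 when the parameter is absent
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unparsable q as "not acceptable"
        if quality > 0:
            return "application/xhtml+xml"
    return "text/html"


if __name__ == "__main__":
    # Example header values, made up for illustration.
    print(choose_content_type("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"))
    print(choose_content_type("image/gif, image/jpeg, */*"))
```

Even this small version means every page has to stay valid under both the XML and the HTML parsing rules, which is the messiness mentioned above.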

Given the limited advantages, I strongly recommend HTML, especially if you are a beginner.

Community
Casebash
0

HTML and XHTML are the same language, with slightly different syntaxes. Once you know one, you know the other.

It really doesn’t matter.

Paul D. Waite
  • It isn't super important to get this right for a beginner, but even so going with HTML will be easier for a beginner – Casebash Jul 07 '10 at 10:27
  • “HTML will be easier for a beginner” — how? – Paul D. Waite Jul 07 '10 at 11:17
  • 1
    @Paul: Dual serving based on the accept header is messy. XHTML served as HTML requires you to follow [compatibility guidelines](http://www.w3.org/TR/xhtml-media-types/#compatGuidelines). Either option is quite complex – Casebash Jul 07 '10 at 12:27
  • @Casebash: I guess the compatibility guidelines would be confusing if you’re coming from an XML background. If not, I don’t think you’d even notice them. – Paul D. Waite Jul 07 '10 at 22:43
  • @Paul: Are you saying that we should teach XHTML without teaching XML? – Casebash Jul 07 '10 at 23:15
  • 1
    @Casebash: Depends really. If you’re actually going to be using XHTML as XML at any point, then learning XML would help. But if you’re just writing XHTML for the web, then you’re not actually writing XML, so I reckon you could quite happily skip XML knowledge. And really, what is there to XML itself (i.e. ignoring any specific XML languages)? It’s just a set of syntax rules, right? – Paul D. Waite Jul 08 '10 at 15:00