33

HTML5 doctype example.

Both IE9 and Chrome 14 log TBODY as the tagName of the element inside the <table>.

The HTML5 spec on <table> clearly states:

followed by either zero or more tbody elements or one or more tr elements

Furthermore, the HTML5 spec on <tr> clearly states:

As a child of a table element, after any caption, colgroup, and thead elements, but only if there are no tbody elements that are children of the table element.

Why are browsers corrupting my DOM and injecting a <tbody> when

  • I did not ask for one
  • It's perfectly valid without one

The answer of "backwards compatibility" makes absolutely zero sense because I specifically opted in to an HTML5 doctype.

BoltClock
Raynos
  • Is this chrome specific or is it occurring in other major vendors? – Incognito Sep 20 '11 at 19:06
  • it's happening in all browsers. an answer might be found here: http://stackoverflow.com/questions/1678494/why-does-firebug-add-tbody-to-table – Eduard Luca Sep 20 '11 at 19:08
  • @jimbojw legacy code can use the HTML4 doctype – Raynos Sep 20 '11 at 19:08
  • How are you determining *injection*? By *inspecting* the element? JavaScript? – Jason McCreary Sep 20 '11 at 19:08
  • 1
    @Raynos I mean, the legacy browser code. The code that makes up the DOM parsing algorithms of our browsers. – jimbo Sep 20 '11 at 19:09
  • @JasonMcCreary I was using the chrome inspector. I'll be more specific. – Raynos Sep 20 '11 at 19:10
  • Possibly they (Chrome, other browsers) depend on it themselves for rendering purposes... in print mode? I agree it's stupid. – sg3s Sep 20 '11 at 19:11
  • @sg3s the point is, these browsers are being non compliant with HTML5. If this is the case it's a bug and will be raised as a bug – Raynos Sep 20 '11 at 19:13
  • @Raynos if I remember correctly, thead and tbody were invented to make headers for long tables possible: tables that could be split in print mode, where the thead would accompany a part of the body on each new page... Don't know if that actually worked anywhere at some point. – sg3s Sep 20 '11 at 19:16
  • 2
    How is it non-compliant with HTML 5? The TBODY isn't required, as you note, but it is perfectly valid. – kindall Sep 20 '11 at 19:16
  • @kindall The browser has no business rendering my _valid html_ with a non-existent tbody element. – Raynos Sep 20 '11 at 19:16
  • @sg3s I know the reasoning behind all this. I just don't understand why browsers corrupt valid html into something else in the DOM. Why can't they render my html verbatim in the DOM? – Raynos Sep 20 '11 at 19:17
  • closely related: http://stackoverflow.com/questions/938083/why-do-browsers-insert-tbody-element-into-table-elements – Ciro Santilli OurBigBook.com Jul 20 '14 at 11:33

6 Answers

41

The answer of "backwards compatibility" makes absolutely zero sense because I specifically opted in to an HTML5 doctype.

However, browsers don't differentiate between versions of HTML. HTML documents with HTML5 doctype and with HTML4 doctype (with the small exception of HTML4 transitional doctype without URL in FPI) are parsed and rendered the same way.

I'll quote the relevant part of HTML5 parser description:

8.2.5.4.9 The "in table" insertion mode

...

A start tag whose tag name is one of: "td", "th", "tr"

Act as if a start tag token with the tag name "tbody" had been seen, then reprocess the current token.
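To make the quoted rule concrete, here is a toy sketch of it in JavaScript. It is not the real HTML tree-construction algorithm: it takes a pre-split list of tag names instead of markup, ignores text, attributes and error handling, and the names `buildTree`, `outline` and `ROW_OR_CELL` are made up for this illustration. The step that implies a `tr` for a stray `td`/`th` mirrors the separate "in table body" insertion mode, which is not part of the passage quoted above.

```javascript
// Toy model of the "in table" rule. Hypothetical names, simplified input:
// start tags are given as "tr", end tags as "/tr".
const ROW_OR_CELL = new Set(["td", "th", "tr"]);

function buildTree(tokens) {
  const root = { tagName: "#root", children: [] };
  const stack = [root];
  const current = () => stack[stack.length - 1];

  // Create an element, append it to the current node and make it current.
  const open = (name) => {
    const node = { tagName: name, children: [] };
    current().children.push(node);
    stack.push(node);
  };

  for (const token of tokens) {
    if (token.startsWith("/")) {
      // End tag: pop until the matching element is closed. This also
      // closes an implied tbody that the markup never closed itself.
      const name = token.slice(1);
      while (stack.length > 1) {
        const closed = stack.pop();
        if (closed.tagName === name) break;
      }
      continue;
    }
    // "Act as if a start tag token with the tag name 'tbody' had been seen,
    // then reprocess the current token."
    if (current().tagName === "table" && ROW_OR_CELL.has(token)) {
      open("tbody");
    }
    // Assumption taken from the "in table body" mode: a cell directly
    // inside a tbody implies a tr in the same way.
    if (current().tagName === "tbody" && (token === "td" || token === "th")) {
      open("tr");
    }
    open(token);
  }
  return root;
}

// Render a node as a compact nested outline, e.g. "table(tbody(tr(td)))".
function outline(node) {
  const kids = node.children.map(outline).join(",");
  return kids ? `${node.tagName}(${kids})` : node.tagName;
}
```

For example, `outline(buildTree(["table", "tr", "td", "/td", "/tr", "/table"]).children[0])` yields `table(tbody(tr(td)))`, which is the same shape the question observed in the inspector. Feeding it an explicit `tbody` produces the same tree, so the implied element is indistinguishable from a written one.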

BoltClock
duri
  • Your reference to the HTML5 parser is right on! I've written on this in slightly more length: [Browsers Always Assume TBODY](https://alanhogan.com/code/implied-tbody) – Alan H. Apr 14 '22 at 23:47
19

You're completely missing the part in the HTML5 spec that specifies how the tree is constructed.

The spec allows you to write a table without the tbody element as it's implied. Just like if you skip the html, head or body opening or closing tags, your page can still be correctly rendered.

I assume you'd like the DOM to contain a body for your content should it be left out for any reason. The same goes for tbody. It's added because the parser assumes you omitted it.

The rules for table parsing

A start tag whose tag name is one of: "td", "th", "tr"

Act as if a start tag token with the tag name "tbody" had been seen, then reprocess the current token.

zzzzBov
7

From my experience, browsers don't differentiate between HTML5 and HTML4 documents. They behave the same for both. The <!doctype html> doesn't trigger any special behavior in browsers.

And also <!doctype html> is not reserved for "HTML5 documents" - it's just the simplest possible doctype which triggers standards mode.

Šime Vidas
  • 4
    @Raynos I don't use the name "HTML5 document". Web-pages are HTML documents. Some web-pages use features which are defined for the first time in HTML5, some others don't. But they are all HTML documents. My response could be: What's the point of naming HTML documents HTML5 documents? – Šime Vidas Sep 20 '11 at 19:52
  • but we can explicitly define a document type as HTML3, HTML4, current. What's the point of doing that? – Raynos Sep 20 '11 at 19:58
  • @Raynos As far as I know, the only reason why a doctype should be defined is to trigger standards mode. Apart from that, the doctype serves no purpose. Again, as far as I know. – Šime Vidas Sep 20 '11 at 20:09
  • 3
    @Šime - For a *browser*, that's the only reason. *Validators* are different. – Alohci Sep 20 '11 at 20:59
5

Much of this comes about because HTML5 merges the successor to HTML 4 and XHTML 1.x into a single specification.

When XHTML 1.0 was introduced and browsers started to experiment with using an XML parser, they hit a problem. Authors were used to writing <table>s without <tbody>s. Since an XML parser isn't allowed to infer tags like HTML parsers did, the best way to help authors to transition to XHTML (which seemed like a good idea at the time) was to get the tables to render properly by allowing <tr>s to be the direct children of <table> inside the DOM. (The DOM is as much as possible the same, regardless of whether it originated from an HTML parse or an XML parse.) So the browsers implemented support for this.

Now the HTML5 content model is shared between the HTML and XHTML serializations of HTML5, so it has to allow for both arrangements, i.e. with or without tbody.

On the other hand, in the section on "The HTML Syntax" (which does not apply to the XML parser), it makes clear that an HTML parse will infer the tbody tags.

When <table><tr><td>my text</td></tr></table> is served as text/html the table structure created in the DOM will have the tr as a direct child of a tbody which is the direct child of the table. The HTML5 content model says this is OK.

When <table><tr><td>my text</td></tr></table> is served as application/xhtml+xml the table structure created in the DOM will have the tr as a direct child of the table. The HTML5 content model says this is also OK.

It is also possible to create a tr as a direct child of table through scripting. For the same reason, browsers will treat this as a table row, as most people expect.
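Since both DOM shapes are allowed, a script that walks tables may want to tolerate either one. The following is a minimal sketch of that idea, not a recommendation over the built-in `rows` collection that browsers expose on table elements. The helper name `collectRows` is hypothetical; it only relies on `tagName` and `children`, so it also works on plain objects outside a browser. Tag names are upper-cased before comparing because they are reported in upper case for an HTML parse and in their original lower case for an XML parse.

```javascript
// Hypothetical helper: gather the rows of a table whether they sit directly
// under the table or inside a thead/tbody/tfoot wrapper.
const ROW_GROUPS = new Set(["THEAD", "TBODY", "TFOOT"]);

function collectRows(table) {
  const rows = [];
  for (const child of Array.from(table.children)) {
    const tag = child.tagName.toUpperCase();
    if (tag === "TR") {
      // Row as a direct child (XML parse, or a row added by script).
      rows.push(child);
    } else if (ROW_GROUPS.has(tag)) {
      // Row wrapped in a row group (what the HTML parser produces).
      for (const grandchild of Array.from(child.children)) {
        if (grandchild.tagName.toUpperCase() === "TR") rows.push(grandchild);
      }
    }
  }
  return rows; // rows are returned in document order
}
```

Code written this way does not care whether the markup was served as text/html or application/xhtml+xml, which sidesteps the difference described above.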

Alohci
  • Why does `text/html` inject a `<tbody>`? Is there a good reason? – Raynos Sep 20 '11 at 21:40
  • @Raynos - That is, as others have pointed out, for backward compatibility. Many scripts and css selectors assume that browsers will continue to do what they've always done, and that there will be a tbody element there. – Alohci Sep 20 '11 at 21:49
  • but wouldn't those same scripts & css selectors break on `application/xhtml+xml` ? – Raynos Sep 20 '11 at 21:50
  • 3
    Yup. And library code often does. But XHTML is different enough that most authors who serve it as XML use different script and CSS to avoid those problems. – Alohci Sep 20 '11 at 21:52
3

This is for "historical reasons" (i.e. backwards compatibility, something which is very important to HTML):

For historical reasons, certain elements have extra restrictions beyond even the restrictions given by their content model.

A table element must not contain tr elements, even though these elements are technically allowed inside table elements according to the content models described in this specification. (If a tr element is put inside a table in the markup, it will in fact imply a tbody start tag before it.)

Please note that this quote is from the "HTML Syntax" section. This section applies only to documents, authoring tools, and markup generators and explicitly not to conformance checkers (which need to use the HTML parsing algorithm).

So: The specification says that using tr outside tbody is allowed as per the content model and the parsing specification, but anything that generates HTML (including YOU) should use tbody.
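As a small illustration of "anything that generates HTML should use tbody", a markup generator can simply emit the element itself. This is only a sketch under assumptions: `renderTable` and `escapeHtml` are hypothetical names, the input is an array of rows where each row is an array of cell values, and every cell is rendered as a plain `td`.

```javascript
// Hypothetical helpers for generating table markup with an explicit tbody.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function renderTable(rows) {
  const body = rows
    .map((cells) => {
      const tds = cells.map((cell) => `<td>${escapeHtml(cell)}</td>`).join("");
      return `<tr>${tds}</tr>`;
    })
    .join("");
  // Write the tbody out explicitly so the markup already matches the DOM
  // that an HTML parse would build from it.
  return `<table><tbody>${body}</tbody></table>`;
}
```

With this, `renderTable([["my text"]])` returns `<table><tbody><tr><td>my text</td></tr></tbody></table>`, so nothing is left for the parser to imply.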

NikiC
  • That directly clashes with "As a child of a table element". Why does the WHATWG specification override what the HTML5 specification says? Why is there a contradiction? – Raynos Sep 20 '11 at 19:19
  • 1
    so could you explain how the "content model" and the DOM are different? (By editing the answer) – Raynos Sep 20 '11 at 19:24
1

Backwards compatibility is not just about the doctype; scripts might rely on a tbody element being there.

Matteo Riva
  • 3
    Don't include scripts that don't work with HTML5. If you opt into HTML5 don't use bad code. – Raynos Sep 20 '11 at 19:08
  • @Raynos while I do agree, browsers are designed for those who do bad things in HTML. –  Sep 20 '11 at 19:16