Browsers Always Assume TBODY
In 2019, I noticed that I had this unpublished draft from 2011. I made some quick edits and then hit the figurative publish button.
Here’s a fun fact regarding web browsers and your markup: It does not matter if your HTML has a tbody element. If you have a table with at least one table row (tr) that isn’t part of a header or footer section, you will get a tbody in the DOM. Put another way, every tr element will have a parent that is not table, even though it’s perfectly valid to write HTML with tr children of table.
Haven’t heard of tbody? It is a collection of some or all of the rows in a table. Each row must belong to the table’s header (thead), to a tbody, or to the footer (tfoot). While an HTML document can only have one body, an HTML table can have more than one tbody.
The fact that table elements get implied tbody sections is nothing new, but it is rather easy to overlook.
I have confirmed this behavior in major browsers: IE 7/8/9, Firefox 3.6 and 6, Safari 5.1, Opera 11.51, Safari/iOS 4, Android 2.
HTML loves tbody.
After spending a few minutes looking at the HTML5 section on tbody and even the HTML 4 Table spec, it becomes clear that, in HTML, a tbody is implied when the browser sees a tr element that is not in a thead, tfoot, or tbody already.
That is, the only elements that can truly¹ be direct children of a table element are, in order:
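1. caption – optional.
2. col or colgroup – zero or more.
3. thead – optional.
4. tfoot – optional. Appears before tbody so the table footer can be rendered before the entire table is downloaded. (That ordering is the HTML 4 rule; the current HTML standard places tfoot after tbody.)
5. tbody – implied, if not explicit. One or more.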
The tbody opening and closing tags are optional, if that makes sense.
Your parser might not love tbody.
The sneakiness of tbody can be a problem. A number of server-side parsing libraries aren’t tbody-savvy.
One such library is Nokogiri, at least as compiled on my systems. (Nokogiri actually uses different XML parsers depending on where it’s used, which has got to introduce incredibly frustrating edge-case bugs.)
This means that the DOM constructed by a tool like Nokogiri and the one constructed by a browser may differ. This is, in fact, how I discovered this behavior. While building Blogic, I was implementing functionality that allowed users to select parts of the DOM from an existing webpage to be removed or replaced in the creation of a template. The selection happened in the browser; our code described the element's position in the DOM; and then the template creation happened server-side based on that description. The browser saw a tbody element where the server-side code didn’t, which was an interesting bug to track down and work around. (See work-arounds, below.)
Note: The above information was correct as of 2011. I do not know whether it is correct today.
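To make the mismatch concrete, here is a minimal sketch that uses Python's standard-library html.parser as a stand-in for a parser that is not tbody-savvy. It is not Nokogiri, and ParentRecorder and parents_of are names made up for this illustration. It records each element's parent using only the tags that actually appear in the source.

```python
from html.parser import HTMLParser


class ParentRecorder(HTMLParser):
    """Record (tag, parent tag) pairs using only tags present in the source."""

    def __init__(self):
        super().__init__()
        self.stack = []  # tags of the currently open elements
        self.pairs = []  # (tag, parent tag or None) in document order

    def handle_starttag(self, tag, attrs):
        parent = self.stack[-1] if self.stack else None
        self.pairs.append((tag, parent))
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, if there is one.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def parents_of(markup, tag):
    """Return the parent tag of every occurrence of `tag` in `markup`."""
    recorder = ParentRecorder()
    recorder.feed(markup)
    recorder.close()
    return [parent for name, parent in recorder.pairs if name == tag]


MARKUP = "<table><tr><td>cell</td></tr></table>"
tr_parents = parents_of(MARKUP, "tr")
print(tr_parents)
```

This prints ['table'], whereas a browser would report tbody as the parent of that same tr. Any element path computed on one side will not line up with the other.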
You might not love tbody when writing CSS.
Another time the sneaky nature of tbody elements might break your code is if you expect to be able to write CSS selectors like table.foo > tr > td (maybe you wish to avoid selecting nested tables' cells). Not gonna cut it: the tr is a child of the implied tbody, not of the table, so the selector matches nothing. Consider table.foo > * > tr > td instead.
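The effect can be mimicked outside a browser. The sketch below assumes the tree a browser would build (with the tbody written out explicitly) and uses ElementTree's path syntax as a rough analog of the CSS child combinator. The two paths stand in for the selectors table.foo > tr > td and table.foo > * > tr > td; it is an illustration of the tree shape, not a CSS engine.

```python
import xml.etree.ElementTree as ET

# The tree a browser would build: the tr sits inside an implied tbody.
BROWSER_DOM = ET.fromstring(
    "<table class='foo'><tbody><tr><td>cell</td></tr></tbody></table>"
)

# Rough analog of `table.foo > tr > td`: td children of tr children of the table.
direct = BROWSER_DOM.findall("./tr/td")

# Rough analog of `table.foo > * > tr > td`: allow any one element in between.
via_section = BROWSER_DOM.findall("./*/tr/td")

print(len(direct), len(via_section))
```

This prints 0 1: the direct path finds no cells, while the path that allows for a section element finds the one cell.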
Work-Arounds
If you don’t want to be surprised by any of this, I would recommend manually defining thead, tbody, and/or tfoot sections for your tables.
But if you, like me, are dealing with HTML generated by others, you’re just going to have to deal with it.
In my case, that is going to mean massaging the DOM tree that Nokogiri creates by:
1. Creating a tbody for each table if none exists.
2. Moving any table > tr elements into the table's tbody element.
Note: Because tables can have multiple implied tbody elements separated by an explicit tbody element, the above algorithm is actually too simple and may cause table content re-ordering. Caveat emptor.
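One way to avoid the re-ordering is to wrap each run of consecutive loose rows in its own tbody, in place. The following is a minimal sketch of that variant, not the code I actually used: it is written in Python with ElementTree rather than Nokogiri, it assumes well-formed markup, and wrap_loose_rows is a name made up for this illustration.

```python
import xml.etree.ElementTree as ET


def wrap_loose_rows(table):
    """Wrap each run of consecutive direct tr children in its own new tbody."""
    new_children = []
    current = None  # tbody collecting the current run of loose rows
    for child in list(table):
        if child.tag == "tr":
            if current is None:
                current = ET.Element("tbody")
                new_children.append(current)
            current.append(child)
        else:
            current = None  # any other child ends the current run
            new_children.append(child)
    # Swap in the regrouped children, keeping the original document order.
    for child in list(table):
        table.remove(child)
    table.extend(new_children)
    return table


table = ET.fromstring(
    "<table>"
    "<tr><td>a</td></tr>"
    "<tbody><tr><td>b</td></tr></tbody>"
    "<tr><td>c</td></tr>"
    "</table>"
)
wrap_loose_rows(table)
sections = [child.tag for child in table]
cells = [td.text for td in table.iter("td")]
print(sections, cells)
```

This prints ['tbody', 'tbody', 'tbody'] ['a', 'b', 'c']: the rows on either side of the explicit tbody each get their own section, and the cell order is unchanged.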
At this point, I started to wonder if Nokogiri and/or its component libraries at least create a ghost tr if they see a td directly in a table… but I did not go down this particular rabbit hole. I will leave it as an exercise for a masochistic reader.
HTML. It sure has its surprises, doesn’t it?
¹ What I mean by this is “in the DOM” as opposed to in HTML. If this does not make sense, think about it this way: Browsers construct, show, and allow interaction with a webpage, which is internally represented by the “DOM” (Document Object Model). HTML is a language to describe the initial state of the DOM that the browser will create. HTML allows some shortcuts. For example, you may not need to close your p tags. Omitting tbody is one such shortcut.