Anne van Kesteren

A hole in HTML

29 March 2004

There has been quite a lot of discussion about whether a DIV carries semantics or not. Ian Hickson:

Any time a div is the answer, there's a hole in HTML.

(Unrelated question: when a markup language is to be used in combination with XHTML (so we are not talking about MathML, which doesn't really need XHTML) should it have its own namespace or is it possible within the lines of extensibility to extend the XHTML namespace?)

Comments

Is it all that important whether or not a DIV contains semantics? The element is used to divide one distinct section from another. That behavior has several possible applications, some of which have semantic significance, and some of which have not.
Usually, a DIV element is used to provide a CSS styling hook. No real semantics there. But sometimes the element is used to distinguish between different languages (with the lang attribute), or to provide places to attach unique identifiers (with the id attribute). Both of these applications provide an injection of semantics, do they not?
Are you suggesting that the DIV element is unnecessary?
Posted by Simon Jessey at 11:02PM
Here we go again ;]
[I'm convinced that even divs used for CSS have a semantic, which may be hard to put into words, but for some reason they put together a set of elements which have something in common. Otherwise they wouldn't be presented together..]
But anyway, the important question is, what shall we do about these holes?
Of course HTML has holes, but we can't go on fixing them every time we find one, because we don't want to have a new HTML Recommendation every week.
So until we feel that HTML is too inappropriate for websites, and something new has to come, we should keep on working with these [not too beautiful] hole fixes.
OT: I personally hate Namespaces. They are hell if you work as an XSLT-Programmer.
Posted by ben at 11:42PM
Bring on XHTML 2.0? Nope, that's not the answer. Destroy all HTML browsers and then life will be XML-based bliss (cough-cough)? That's not going to happen overnight either...
I blame the current legacy browser market share.
Posted by Robert Wellock at 12:59AM
Simon Jessey: "Are you suggesting that the DIV element is unnecessary?" In the end, yes. Nowadays, certainly not. Even XHTML 2.0 doesn't solve all our problems concerning markup. Heck, not a single markup language solves those.
The point is that we need multiple markup languages, since HTML doesn't have enough semantics to describe all possible forms of use, like web applications or mathematical web pages.
Posted by Anne at 1:28AM
It's ridiculous to believe that HTML could ever be semantically 'complete'. It needs to strike a balance between being complete enough for most cases and being simple enough to use, especially in the internet = marketing world we're living in.
Having elements without semantics is essential to make sure that authors can add the structure they need to documents (so they can style them) without adding meaningless semantics.
Posted by Menno at 2:08AM
As others have said, (X)HTML can never provide 100% semantics for all possible fields of use. That's why neutral containers such as <div> and <span> are so useful. They indicate holes, yes, but they plug them nicely.
And really, how much semantics do we need? The semantic web may need a lot, but the browsers do not.
Posted by TOOLman at 2:08PM
He's flat out wrong. The DIV element's semantics are about separation or "division" of content sections. It's a grouping element, according to the spec. If you're using a DIV to group content, then a DIV is the answer -- and is the correct answer according to the spec.
I'm pretty sure DIV stands for "division", which means it does have semantic value too, and isn't overly generic.
Even in the context in which he makes that statement, I think he's wrong. A DIV should be used in that case, 'cause he is separating/dividing/grouping a certain section of content from the rest.
Posted by Devon at 4:00PM
I’m pretty sure DIV stands for “division”, which means it does have semantic value too, and isn’t overly generic.

Maybe section would have been a better term, but then we would be talking about SECs all the time ;)
Posted by Menno at 6:02PM
Well, SECTION has been proposed for XHTML 2.0. And in that specification, it has a different meaning than DIV has now.
Posted by Anne at 6:12PM
It feels like I've stated my mind on this issue in this very blog already, multiple times...
So, I don't have much to add on that issue. I do have something to say about semantics, XHTML and XML, however. My first comment would be that XML cannot possibly cover more than a narrow spectrum of semantics because of its single-parent model. A simple example would be a table, whether by the HTML model or some other model: something as simple as having a cell be a descendant of both its row and its column cannot be achieved in XML. Another example would be the limits DTDs place on any language described by them. The first is that you can't have two structurally different elements with the same name, even when that name would be the logical choice semantically. Another problem is that you can't have delegated content models. For example, you can't have the following:
```
<allowcontentmodelaonly>
  <elementwithsemanticmodelthatshouldnotlimititscontentmodel>
    <contentsbelongingtomodela>
  </elementwithsemanticmodelthatshouldnotlimititscontentmodel>
</allowcontentmodelaonly>

<allowcontentmodelbonly>
  <elementwithsemanticmodelthatshouldnotlimititscontentmodel>
    <contentsbelongingtomodelb>
  </elementwithsemanticmodelthatshouldnotlimititscontentmodel>
</allowcontentmodelbonly>
```
Here the contents of the elementwithsemanticmodelthatshouldnotlimititscontentmodel elements are limited to those allowed by the parent, but elementwithsemanticmodelthatshouldnotlimititscontentmodel is not itself limited by that content model. Neither of these problems is very serious, but I can see them limiting me, especially the fact that containers cannot overlap. On the other hand, I can't think of a model as simple as XML that allows that.
Posted by liorean at 9:34PM
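The single-parent point above can be illustrated with a small sketch. It assumes Python's standard xml.etree.ElementTree module and a made-up vocabulary (table, row, cell), not HTML's actual table model. Each cell has exactly one parent, its row, so column membership is not recorded in the tree and has to be derived by position:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for illustration; the element names are made up.
DOC = """
<table>
  <row><cell>a1</cell><cell>b1</cell></row>
  <row><cell>a2</cell><cell>b2</cell></row>
</table>
"""

def columns(table):
    """Derive columns by position, since the tree only records rows as parents."""
    rows = [row.findall("cell") for row in table.findall("row")]
    # zip(*rows) transposes the row-major grid into columns.
    return [[cell.text for cell in col] for col in zip(*rows)]

table = ET.fromstring(DOC)

# Map every element to its single parent element.
parent_of = {child: parent for parent in table.iter() for child in parent}

# Every cell's only parent is a row; no cell has a "column" ancestor.
cell_parents = {parent_of[cell].tag for cell in table.iter("cell")}
cols = columns(table)

print(cell_parents)
print(cols)
```

The column grouping exists only in the helper function, not in the document itself, which is roughly the limitation being described.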
The table model is complex indeed. About DTDs and multiple elements having the same name: that issue has been "fixed". Multiple elements can have the same name by using namespaces (svg:script and xhtml:script, for example), and describing those can be done using the 800-pound XML Schema (the "800 pound" thing comes from a little book called 'XPath and XPointer').
Posted by Anne at 10:00PM
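As a rough illustration of the namespace approach, here is a sketch assuming Python's standard xml.etree.ElementTree module. The wrapper element doc is made up; the two namespace URIs are the usual SVG and XHTML ones. Three elements share the local name script, yet a namespace-aware parser sees three different names:

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper document; only the namespace URIs are real.
DOC = """
<doc xmlns:svg="http://www.w3.org/2000/svg"
     xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <svg:script/>
  <xhtml:script/>
  <script/>
</doc>
"""

def local_name(tag):
    """Strip the {namespace-uri} part of an ElementTree tag, if any."""
    return tag.rsplit("}", 1)[-1]

root = ET.fromstring(DOC)

# ElementTree reports each name in the form {namespace-uri}local-name.
names = [child.tag for child in root]
local_names = {local_name(tag) for tag in names}

print(names)        # three distinct expanded names
print(local_names)  # a single shared local name
```

Note that this compares namespace-expanded names as the parser sees them. A DTD works on the literal prefixed names instead and knows nothing about namespaces.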
No, no, no. What I meant was just that - elements with the same name.
script ≠ svg:script ≠ html:script
Sure, the local name is the same, but the qualified name is different. In the DTD, you specify qualified names, and you can't use the same qualified name for more than one element, even if their semantic meaning in different contexts is a little different and they therefore need another set of attributes or another content model. No, you need to make sure that they don't use the same name, for example by using namespaces or by simply adding a contextual part.
Again, look at tables. There's really no reason for having the name thead instead of simply head, except for restrictions imposed by the DTD syntax. The t is redundant, as the fact that it's enclosed in a table element already gives us the same information. We already know that it's not the same element as the head that is the first child of the html element, because of the context.
Posted by liorean at 12:08AM
It is possible to use HEAD for multiple purposes. The problem is that you don't want to describe semantics based on element relations. In HTML, you can do this, since it is "required" to create a valid document so every relation created between elements is "fixed" in the DTD. You could say that when HEAD is a child of HTML it will have different semantics than when HEAD is a child of TABLE (with CSS you can create the presentational difference).
The problem is that those fixed relationships between elements are not always there and I would have to know more about the markup language before I know the exact semantics an element can have.
What will tell you more about an element: THEAD or just HEAD (is this a fair comparison?). From THEAD I can tell that it has something to do with tables, because of the T-prefix. If HEAD were a child of TABLE I couldn't tell that. I would have to know more about the document language.
Furthermore, in XML, documents have to be well-formed, not valid. This means that I could make the HEAD element the root element and import that document using XInclude into another. How would someone looking at that document know which HEAD element we are looking at? (It just pops into my head that they probably have different children, but that is not always the case, I hope.)
Posted by Anne at 1:53PM
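A minimal sketch of the well-formed versus valid distinction, assuming Python's standard xml.etree.ElementTree module, which is a non-validating parser. Both documents below are made up. Each has a root element named head, and the parser accepts both, because it checks only well-formedness; nothing in the root's name says which kind of head it is:

```python
import xml.etree.ElementTree as ET

# Two hypothetical stand-alone documents, both rooted at an element named "head".
PAGE_HEAD = "<head><title>A page</title></head>"
TABLE_HEAD = "<head><row><cell>Name</cell></row></head>"

def is_well_formed(text):
    """Return True if the text parses as XML; no DTD or schema is consulted."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

roots = [ET.fromstring(doc) for doc in (PAGE_HEAD, TABLE_HEAD)]

# The root names are identical; only the children hint at the intended meaning.
root_tags = [root.tag for root in roots]
child_tags = [[child.tag for child in root] for root in roots]

print(root_tags)
print(child_tags)
print(is_well_formed("<head><title>unclosed</head>"))
```

In this sketch the two head elements can be told apart only by their children, which is the guess made in the comment above.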
Furthermore, in XML, documents have to be well-formed, not valid. This means that I could make the HEAD element the root element
There is no particular constraint imposed by validating systems that would stop you.
```
<!DOCTYPE samp PUBLIC "-//W3C//DTD HTML 4.01//EN">
<samp/Hello, world!/
```
Posted by EBB at 2:25AM