Anne van Kesteren

How easy is XHTML2?

19 October 2003

If we consider that almost nobody understands HTML4 correctly (although lots of people say they do), XHTML2 will be tough. Before you actually understand everything and know what each feature is for, you will probably have read the specification and a few other documents. Maybe it is not so bad that it is not backwards compatible, but because of this, lots of things have to be learned (again?). Some people think object will replace img. That is partly true: object has a much better fallback solution, so longdesc isn't needed anymore. But there are also other things which could replace img. Some designers, who think XHTML2 is the next evil thing, want to do everything with CSS2 background images. I think that the src and type attributes could be used for images. My header could look like this in XHTML2:

<h1 id="header" href="/" rel="bookmark" accesskey="1" src="/mt/img/logo.png" type="image/png">
   Weblog about <strong>Markup</strong> and <em>Style</em>
</h1>

Looks good, doesn't it? I just dropped two whole elements, thanks to the XHTML(2) attribute collections. I could still use CSS for the background pattern, as I do now. Did we lose any semantics? Interesting question. We dropped a, which has the semantic value of span; I would argue that that is a semantic win :). We also lost object, which was only used to embed the graphic and so has as much semantic value as a span or div (object can be either inline or block-level). The only things which could have been useful and are dropped are the height and width attributes. I'm not sure if that is a good or bad thing.

Well, that was just an example of good usage IMHO, but it also shows how complex you can make an XHTML2 document. You can embed style sheets everywhere you want them, in addition to the style attribute and the style element. The link element doesn't disappear either. I thought we had this recommendation for that: Associating Style Sheets with XML documents. Probably someone thought that there should be at least some backwards compatibility, which is crazy, because older browsers can't use any of the new menus that put the href attribute directly on the li element (I know there is such a thing as XSLT, but nobody is going to write a style sheet for every static document).
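To make the backwards compatibility point concrete, here is a rough sketch of the kind of downgrade an older browser would need: every element that carries href directly gets its content wrapped in a nested a. It uses Python instead of XSLT only to keep it short. The function name, the sample menu and its URLs are made up for illustration, and namespaces are ignored.

```python
import xml.etree.ElementTree as ET

def downgrade_href(root):
    """Wrap the content of every non-<a> element that carries href in a nested <a>."""
    # Snapshot the elements first, because the tree is modified while looping.
    for el in list(root.iter()):
        if el.tag == "a" or "href" not in el.attrib:
            continue
        link = ET.Element("a", {"href": el.attrib.pop("href")})
        # Move the element's text and children into the new <a>.
        link.text, el.text = el.text, None
        for child in list(el):
            el.remove(child)
            link.append(child)
        el.append(link)
    return root

# Hypothetical XHTML2-style menu with href directly on li.
menu = ET.fromstring(
    '<ul><li href="/">Home</li><li href="/archives/">Archives</li></ul>'
)
downgrade_href(menu)
result = ET.tostring(menu, encoding="unicode")
print(result)
```

Attributes such as src and type would need their own handling (an img or object fallback, for instance); this sketch only covers href.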

Eventually some people will understand the basics, but they will probably mess things up when it comes to MIME types and scripting. Especially since I think they will try to use scripts they found on other sites, and the current Internet Explorer (which will not come) is going to support all that non-standard bullshit.

Probably the most difficult module, and the one I hope is not going to be in XHTML2, is XForms 1.0. I actually hope an alternate draft is going to be accepted. I tried to understand XForms, but after I found something on the XForms mailing list (through Bugzilla), I don't think I'm going to try to understand it all. To quote a part of the mail:

XForms has too many dependencies. In addition to XForms itself, an XForms implementation needs to support XML with namespaces, XML Schema, XPath, XML Events, DOM Events, DOM Core, CSS, a stylesheet linking technology (e.g. the XML Stylesheet PI), and a host language (e.g. XHTML or SVG). In particular, its dependency on XML Schema is of great concern to us.

Woo.. that is a lot actually. XML Schema especially is a pain to understand (and to implement too, I think). I bought a book about it from O'Reilly and it has so many options you don't want to know.. I must say I found an excellent tutorial on the Markup Forms page: XForms for HTML Authors. I read it and I understand XForms a lot better now, but it will never be as easy as the specification Ian Hickson wrote for Opera: XHTML Module: XForms Basic (URI can change).

I read it once and I understood it directly (well, most of it). I also submitted some feedback and now there is <input type="uri"/> :). It uses a lot of backwards compatible properties and introduces some new ones, plus one that was already used in modern browsers but never made it into the standard: autocomplete. I think that if this draft becomes a W3C recommendation, forms will be a lot easier to create and will offer a lot more functionality to form creators.

(About XFrames: I tried to create something, XFrames test (sent as text/plain), but the specification is too far from finished to create something useful.)

Comments

Umh. I thought XHTML 2 dropped the <hn> tags in favour of a single <h> tag, whose "level" is determined by nesting <section> tags.
Personally, I think the <section> tag is going to be an extremely ironic challenge to "semantically-inclined" web designers. There are lots of documents which are naturally divided into sections and subsections and sub-subsections. But most web pages are not. So web designers adopting XHTML 2 are going to go to very amusing (and unsemantic) lengths to achieve the desired nesting of header tags, which they can achieve without all the unsemantic cruft using the current <hn> tags.
Posted by Jacques Distler at 1:30AM
Actually, according to the XHTML2 working draft:
There was a suggestion that h1 - h6 be deprecated. The working group has not yet addressed this suggestion.

Currently, both methods of applying headers are acceptable.
I also, however, agree about the <section> tags. They seem to be a rather counter-intuitive approach to enforcing the "don't skip header numbers" rule. Hopefully, this will get cleaned up in the future.
Posted by John Paul Taylor II at 6:05AM
I'm as into semantic goodness as the next standards geek, but as someone who likes to earn money for his work, I have a hard time seeing the real benefits of XHTML 2 for the immediate future. Not because there's anything particularly good about the old way of doing things, but because there is a baseline of supported tags which can serve you going all the way back to v1.0 browsers. To use XHTML 2 accessibly is going to require a much longer consumer technology catch-up period than XHTML 1.0 Strict ever did. In the meantime, most of the nastiness of 90s presentational markup has been eliminated from my repertoire, and I can push the envelope for new browsers using CSS 2 and 3 without breaking basic accessibility in all browsers. I can see the point of getting rid of img and a tags in theory, but in practice how much do we really gain by this? Isn't it just semantic nit-picking? Is there even a well-defined border within which something is purely semantic and outside it is not? Is this all just an overreaction to the bastardization of HTML by Netscape and Microsoft during the browser wars? I'm sticking with XHTML 1.0 Strict for the foreseeable future.
Posted by Gabe at 8:36AM
Currently, both methods of applying headers are acceptable.

I hadn't noticed that <hn> had made their way back into the current working draft. Giving authors the ability to mix and match both <hn> and <h> tags in the same document should make for some quite "interesting" results.
Posted by Jacques Distler at 9:36AM
I must admit I was only trying to remove elements and not really checking whether the elements I was using were correct. I think section and h are a good idea, but I'm curious whether everything you want is possible (not looking at browser support, just CSS3 and XHTML2, and not layouts that are already impossible). I think there are some effects you can have now, but not in the future (from an (XHTML2) semantic point of view the document must be perfect, of course).
Gabe,
Do you know XSLT ;).
Posted by Anne at 1:48PM
I have to say I rather like the new header/section model. Giving something a header size just to make it look the way certain other headers do seems wrong to me; if it's not at the right depth, don't force it. Or use CSS.
The new model would solve a problem for me: how do you know what level of header a piece of content has? On the front page, there might be an h1 and an h2 (site title and section title, say), but an individual entries page might only have the title of the entry. How do you know what size of header to give subsections? It takes a fair bit of hacking to come up with a decent solution now. I don't want to have to think about it; I just want it to work and be semantic. XHTML 2 seems to enable that.
Posted by Gary F at 11:12PM
I don't see a problem with nested sections to get header levels. Or to be more precise, I don't see a problem we don't already have in the form of nested lists.
Frankly, having seen some writers who torture outlines into all kinds of laughable shapes, the section approach appeals to me: it would be harder to make the form wrong using sections than Hn tags.
Posted by Adam Rice at 5:14AM
Balls!
Here's a garden-variety snippet of XHTML 1.x which uses <hn> tags (from an email exchange I had with Dave Shea on this topic):
```
<div id="intro"><h1>Title</h1><h2>Subheader</h2></div>
<div id="content">
	<h3>Content Header</h3>
	<div id="linkList"><h4>Link List header</h4></div>
</div>
```
Let's convert that to XHTML 2 <section> and <h> elements:
```
<section id="intro"><h>Title</h>
  <section id="extraIntroOrWhatever"><h>Subheader</h></section>
</section>
<section id="content">
  <section>
    <section>
      <h>Content Header</h>
      <section id="linkList"><h>List header</h></section>
    </section>
  </section>
</section>
```
Ugh! The incredibly unsemantic cruft that was required here is the result of the fact that the structure of the original document was not properly described in terms of nested sections and subsections. Many documents are well-described that way. Most web pages are not. You could cruft them up by rewriting them in XHTML 2, but it's not clear why you would want to do that.
P.S.: Anne! Get yourself a preview function, or I'm gonna stop posting comments here. It was too painful to write this blind.
Posted by Jacques Distler at 12:46PM
And even then, it didn't work right because I "forgot" that your comment system does not support the <pre> tag.
I give up!
Posted by Jacques Distler at 12:50PM
I fixed your code, but is this really the way of doing something? To me it just looks like you are replacing divs with sections, while I think you should only nest sections and do the styling with CSS.
I understand you need a whole bunch of quality selectors and properties to get the job done, but CSS3 offers us a lot, doesn't it?
Posted by Anne at 2:11PM
The nested <section>s are, in this case, unsemantic cruft. They are there to get the right level of <h> tags, but do not (otherwise) reflect anything about the structure of the document. Well ... OK ... the <div>s in the original example are reasonably replaced by <section>s. It's the extra <section>s that are unsemantic cruft.
Posted by Jacques Distler at 7:43PM
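To see which header levels the nested sections in the conversion above actually produce, here is a small Python sketch. It assumes, as the conversion itself does, that the level of an h is simply the number of section elements enclosing it. The function name and the wrapping body element are only there to make the snippet parse on its own.

```python
import xml.etree.ElementTree as ET

def heading_levels(root):
    """Return (level, text) for every <h>, where level counts the enclosing <section>s."""
    found = []

    def walk(el, depth):
        if el.tag == "section":
            depth += 1
        elif el.tag == "h":
            found.append((depth, "".join(el.itertext()).strip()))
        for child in el:
            walk(child, depth)

    walk(root, 0)
    return found

# The XHTML 2 conversion from the comment above, wrapped in a single root element.
doc = ET.fromstring("""
<body>
<section id="intro"><h>Title</h>
  <section id="extraIntroOrWhatever"><h>Subheader</h></section>
</section>
<section id="content">
  <section>
    <section>
      <h>Content Header</h>
      <section id="linkList"><h>List header</h></section>
    </section>
  </section>
</section>
</body>
""")

levels = heading_levels(doc)
for level, text in levels:
    print(level, text)
```

The two anonymous sections inside "content" contribute nothing except one level of depth each, which is exactly the cruft being complained about.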
The long-running <object> element debate was discussed in depth within the HTML 4.01 Recommendation, and XHTML 2.0 is certainly far from stable enough to predict where it is going to end up.
There are a few good ideas, though many backwards suggestions are being proposed, wrecking some of the progress made by XHTML 1.1.
Posted by Robert Wellock at 10:05PM

Jacques,
If you use it properly, you would end up with this:

<section id="intro">
  <h>Title</h>
  <section id="extraIntroOrWhatever">
    <h>Subheader</h>
    <section id="content">
      <h>Content Header</h>
      <section id="linkList">
        <h>List header</h>
      </section>
    </section>
  </section>
</section>

What you are trying to do is to duplicate your original example with section and h elements. That is unsemantic cruft, as you call it, but that is not what section is intended for.

Posted by Anne at 12:14AM

No, that version is worse. You have completely changed the logical structure of your document. The original document was divided into 'sections' (except they were called <div>s). That's not something you want to throw out the window just because you're trying to accommodate the new <h> tag.
The problem is that the original document just doesn't follow the model of document structure implicit in these new tags. Rewriting the document using these tags is a matter of trying to fit a square peg into a round hole.
Posted by Jacques Distler at 3:21AM
You are right, I hadn't really thought it through I think. I see what you mean now.
Though I'm not sure what the benefit is of making those things separate sections... Has this been discussed on www-html?
Posted by Anne at 6:56PM
Dunno. But I think this is (one of) the reason(s) why people are lobbying to keep the <hn> tags.
There was nothing bad about replacing some of the "structural" <div>s in the original example with <section>s. Indeed, you might call that something of a semantic improvement. The trouble came in assuming that header levels always naturally correspond to nested subsection level. That's true for some documents, false for others.
Posted by Jacques Distler at 8:20PM
What's with all this <h> and <section> nesting? Maybe the idea is that we should rethink how our documents are structured in the first place and come up with some kind of simplifications for that to replace the expectations involving nested sections and headings. For example, we could use definition lists for content (now that XHTML2 supports things like paragraphs within list elements) rather than outlining each paragraph with a new section and heading.
It's been a while since I read the XHTML2 spec, though, so I may be wrong.
Posted by Rahul at 8:49PM
Eek! You want to subvert definition lists to replace headings and paragraphs?? What ever happened to the idea that definition lists should be ... umh .... for definitions?
If you want to do that, there's nothing stopping you from doing it in XHTML 1.x. You can perfectly well have <p>s inside of <dd>s today. (It's the converse --- having lists inside of paragraphs --- that is new to XHTML 2.)
Posted by Jacques Distler at 9:24PM
<hn> and h elements coexist peacefully in the XML that underpins tagged accessible PDF. Of course, nobody uses them.
Posted by Joe Clark at 2:27AM
Actually, I was suggesting that we use an XHTML2 comparable tag rather than the actual current model for definition lists.
As for the lists-inside-elements thing, I was referring mainly to using paragraphs within <li>.
In any case, I do agree that the XHTML2 spec is a little dodgy-looking at the moment. But if we really start rethinking the structure of our documents more as documents than nested <div>s and wrappers, then perhaps it'll make more sense...
Posted by Rahul at 5:15AM
As for the lists-inside-elements thing, I was referring mainly to using paragraphs within <li>.

What about them? Works for me.
In any case, I do agree that the XHTML2 spec is a little dodgy-looking at the moment.

I wouldn't call it "dodgy" per se. I just don't see how it fulfils any burning need of web-designers (unlike, say, CSS3, which has many folks justifiably excited). As has been endlessly pointed out, very few people actually need XHTML 1.x. (Ie, you could apply an XSLT transformation to their pages and serve them up to user agents as HTML 4.01 and no one would notice the difference.) And if they don't really need XHTML 1.x ...
But, not being a web designer, I could have totally misjudged the matter.
Posted by Jacques Distler at 9:52AM
I don't see the problem with nesting sections to form the levels of headings.
Remember that the HTML spec always advised against jumping from <h1> to <h3>, and all this new form of sectioning does is put some more effort into trying to stop the author doing that.
Of course it's still possible to do it by nesting a section immediately inside a section, and the only way to prevent that is to make headings compulsory when using sections (not entirely a bad idea either, because if you don't want a heading, is it even a section?)
Posted by Trejkaz at 9:38AM
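The "compulsory heading" idea is easy to check mechanically. A minimal Python sketch, assuming the rule is "every section needs an h as a direct child" (that rule is only the suggestion made above, not something taken from the XHTML2 draft); the function name and the sample markup are illustrative:

```python
import xml.etree.ElementTree as ET

def sections_without_heading(root):
    """List the id (or a placeholder) of every <section> that has no direct <h> child."""
    return [
        section.get("id", "(no id)")
        for section in root.iter("section")
        if section.find("h") is None  # find() only looks at direct children
    ]

# A section nested immediately inside a section, which is how a level gets skipped.
doc = ET.fromstring("""
<body>
<section id="content">
  <section>
    <section>
      <h>Content Header</h>
      <section id="linkList"><h>List header</h></section>
    </section>
  </section>
</section>
</body>
""")

missing = sections_without_heading(doc)
print(missing)
```

A validator built on this rule would reject both "content" and the first anonymous section, so the level-skipping trick would no longer be possible.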