Anne van Kesteren

Some notes on XHTML 1.1

A common misconception is that XHTML 1.1 is the latest version of the XHTML series. Although it was released a bit more than a year later than the first edition of XHTML 1.0, the second edition of XHTML 1.0 is actually newer. Furthermore, XHTML 1.1 is not really the follow-up to XHTML 1.0; it just had to be named differently because it is derived from the first edition of the Modularization of XHTML.

This brings me to XHTML Basic, a standard for mobile phones and other devices that is also derived from the XHTML Modularization. XHTML Basic defines a subset of XHTML modules, which in my opinion makes it not really useful. After all, you could just as easily use XHTML 1.0 and let the mobile device ignore the elements it does not understand. If these devices do not implement the SCRIPT element, they probably will not have a validating parser either, just as no desktop browser has a validating parser. Note that neither specification has much to do with XHTML 1.0. There is a list of changes since 1.0, but they do not really supersede it.

To some people XHTML 1.1 seems a bit stricter because it no longer allows the NAME attribute to be used as a fragment identifier, but it is actually more limiting. For example, it does not support the ID attribute on the HTML element, which can be useful in some projects. (Browsers do support it; the DTD just does not allow it.) And if you think the second edition of the XHTML Modularization, from which XHTML 1.1 (Second Edition) will probably be derived, fixes this, you are wrong.

Not that it is really important since XHTML 1.1 is not the follow-up of XHTML 1.0.

That and someone asked me why I was using XHTML. That was a mistake and will be fixed eventually. (Note also that my current publishing system only allows (not guaranteed well-formed) XHTML as output.)
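
To make "not guaranteed well-formed" concrete, here is a minimal sketch of a well-formedness check, written in Python with the standard library's XML parser. The function name and the sample strings are made up for illustration and have nothing to do with the actual publishing system. The check does not load a DTD, so named entities such as &nbsp; would be reported as errors as well.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Hypothetical samples: the first closes its BR element, the second does not.
good = "<p>Line one<br />line two</p>"
bad = "<p>Line one<br>line two</p>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

An HTML parser accepts both samples; an XML parser rejects the second one, which is why such output cannot safely be served as application/xhtml+xml.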

Comments

  1. (Note also that my current publishing system only allows (not guaranteed well-formed) XHTML as output.)

    As I seem to remember that you use a hacked-up version of WordPress, there is an easy remedy for this. WP doesn't use any of the advanced things XHTML was invented for. So you can just as well slap an HTML doctype on it and remove any superfluous slashes with a regexp. And WP already uses text/html by default. Peanuts. :-)

    Posted by Ben at

  2. Ben: but then there always is the problem of upgrading... the sadness.

    Posted by Robbert Broersma at

  3. And that is exactly the reason why I am still using WordPress 1.2 instead of 1.2.2. Quite a few files are hacked and I did not document my changes. I did fix some things though, like the broken HTTP header.

    Now I am waiting for PHP 5.0 support so I can use some other system.

    Posted by Anne at

  4. You could make a diff between the original version and your hacked version and another one between the old and the new version to see where the changes are.

    Posted by Ben at

  5. People usually place their ID’s-as-a-hook-for-styling on the body element anyways, so I would hardly call that a huge issue.

    Anyways, as it happens I’ve been looking at XHTML 1.1 a lot today, because at work we have created an extension of XHTML, and I was looking at making a Schema for it. In the end we cooked something up not based on XHTML 1.1’s Schema, because the purpose was to have Visual Studio’s Intellisense work and well, they’re Microsoft, so I should have expected the Schema implementation to be totally broken (right, the XHTML Schema didn't work at all and I had to use a number of illegal things to even just get our own tags working :)).

    Still, I think we need a Schema someday, for both code validation as well as code completion in editors like Eclipse w/XMLBuddy (which I use) and IntelliJ, so I’ll be revisiting XHTML 1.1 soon. At least with the VS.NET ‘schema’ we have a nice start to make a real XML Schema for our language.

    I have to say, doing this, I am *very* happy with the XHTML 1.1 modularization project, they basically made everything very simple (or at least as simple as it can get, I'd say ;p).

    About your imminent conversion to HTML - IMO, it’s a dead language. XML has just got so many possibilities. So I don’t see why you would bother, especially considering the fact that everything works perfectly well right now.

    ~Grauw

    Posted by Laurens Holst at

    1. You want id's on the <html> element? You want to use the target attribute or the <embed> element (to embed a quicktime movie)? Then the Modularization of XHTML is your friend. I can't really get much excited about XHTML 1.0. But XHTML 1.1 opens up a lot of possibilities.
    2. diff and patch are your friends. My patches to MT 3.1.x run to 511 lines of unified diffs (much less than it would be for WordPress, in which document structure and programming logic are hopelessly intertwined). Upgrading to MT 3.14 was a matter of minutes.

    Posted by Jacques Distler at

  6. Of course there are other reasons for using XHTML 1.1, for example to use Rubytext. I need that for my Japanese header in my weblog, but very few other people will find much use for that unless they have some Japanese (or similar language) on their site.

    I prefer XHTML 1.1 when sending as application/xhtml+xml since it sticks more to XML standards by removing the name attribute on certain elements. So for some reason it feels better. However, it sucks even more than XHTML 1.0 (first or second edition) when it is served as text/html (apart from it being MUST NOT by the W3C in some RFC).

    The id attribute is very unfortunate though. Unless you want Rubytext, XHTML 1.0 Second Edition is probably better than 1.1 when served as application/xhtml+xml.

    Posted by Charl van Niekerk at

  7. If you're really that sad that WordPress outputs XHTML by default and you want to use only HTML, why don't you just use ob_start to filter the XHTML through a regex that will make it HTML?

    Posted by Basil Crow at

  8. Basil: performance. It would probably double the time for generating this page.

    And because such a method is just insane. And sad too. Can one possibly make WordPress even dirtier?

    No WordPress critique here, but well, it is no solution. It's just that for every regular, normal, expectable, logical, sensible thing you want to do with WordPress (such as using a document format that can be interpreted by more than nine percent of the browsers of the surfing population) you have to use a dirty hack instead of the possibilities within this publishing system.

    Posted by Robbert Broersma at

  9. Basil: performance. It would probably double the time for generating this page.

    No it wouldn't. And that's not the reason why you don't want to do it.

    The real reason is that parsing (converting) (X)HTML using regexps is evil! Far, far more evil than serving XHTML with the wrong MIME-type.

    While it's not particularly computationally-expensive, don't even think about converting XHTML to HTML using regexps. I will, however, grant you that doing a "real" XHTML to HTML4 conversion would be more computationally-expensive than one would want.

    WordPress clearly isn't the tool to (easily) emit HTML4. On the other hand emitting HTML4 in MovableType is pretty simple. Mark Pilgrim did it, Evan Goer does it, ... it just ain't that hard.

    I'm not convinced that "ability to emit HTML4" ought to rank terribly high on one's list of criteria for selecting a blogging tool, but there it is ...

    Posted by Jacques Distler at

  10. You want to use the target attribute or the <embed> element (to embed a quicktime movie)? Then the Modularization of XHTML is your friend.

    What would be the point? How would editing the DTD be useful?

    Posted by Henri Sivonen at

  11. What would be the point? How would editing the DTD be useful?

    So you can validate your document against a DTD which (correctly) describes its content. (Yes, I know all about the shortcomings of the W3C Validator, but it's better than friggin nothing.)

    There's value in having things which are deliberately present (the target attribute, the <embed> element), and otherwise used correctly, not flagged as errors (a single <embed> element is usually good for about 20 validation errors).

    Posted by Jacques Distler at

  12. There's value in having things which are deliberately present (the target attribute, the <embed> element), and otherwise used correctly, not flagged as errors (a single <embed> element is usually good for about 20 validation errors).

    Are there XHTML UAs that support embed but not object? Or are you suggesting serving the doc as text/html against the W3C Note?

    Posted by Henri Sivonen at

  13. Are there XHTML UAs that support embed but not object? Or are you suggesting serving the doc as text/html against the W3C Note?

    Both, actually.

    Consider Apple's suggested method for embedding a quicktime movie:

    <object classid="clsid:02BF25D5-8C17-4B23-BC80-D3488ABDDC6B"       width="160" height="144"      codebase="http://www.apple.com/qtactivex/qtplugin.cab">  <param name="src" value="sample.mov">  <param name="autoplay" value="false">  <param name="controller" value="true">   <embed src="sample.mov"      width="160" height="144"      autoplay="false" controller="true"      pluginspace="http://www.apple.com/quicktime/download/">   </embed> </object>

    There are a variety of UAs (Safari, in particular (IIRC), but basically any browser that does not support the Active-X style Quicktime plugin preferred by Internet Explorer) that will not function correctly if you omit the nested <embed> element.

    Moreover, I am perfectly comfortable (I do it with my weblog all the time) serving up the same page as application/xhtml+xml or as text/html, depending on the UA.

    I've experimented with other methods for serving quicktime content, without success (getting autoplay="false" to work is invariably the show-stopper). If you know how to do it across a range of modern browsers (even just across all XML- UAs), I'd love to hear about it.

    Posted by Jacques Distler at

  14. Hmmm. Your Wordpress installation is stripping out carriage-returns on POST. The <pre> element, that previewed OK, now looks like gibberish. Let me try again, this time with hard-coded line-breaks.

    <object classid="clsid:02BF25D5-8C17-4B23-BC80-D3488ABDDC6B" 
    width="160" height="144"
    codebase="http://www.apple.com/qtactivex/qtplugin.cab">
    <param name="src" value="sample.mov">
    <param name="autoplay" value="false">
    <param name="controller" value="true">
    <embed src="sample.mov"
    width="160" height="144"
    autoplay="false" controller="true"
    pluginspace="http://www.apple.com/quicktime/download/"> </embed>
    </object>

    Posted by Jacques Distler at
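
Comment 13 mentions serving the same page as application/xhtml+xml or as text/html, depending on the UA. The thread does not say how that decision is made. The sketch below assumes it is based on the HTTP Accept header, which is one common approach; the function name is hypothetical and the parsing is deliberately simplified (no wildcards, no comparison of q-values between types).

```python
def pick_content_type(accept_header: str) -> str:
    """Choose a MIME type for an XHTML page from an HTTP Accept header.

    Returns application/xhtml+xml only when the client lists it explicitly
    with a non-zero q-value; falls back to text/html in every other case.
    """
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if media_type != "application/xhtml+xml":
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        if q > 0:
            return "application/xhtml+xml"
    return "text/html"


# A header that lists the XHTML type explicitly.
print(pick_content_type("text/html,application/xhtml+xml,application/xml;q=0.9"))
# A header that never mentions it.
print(pick_content_type("text/html, */*"))
```

The fallback to text/html is deliberate: a client that only sends */* makes no promise that it has an XML parser, so the sketch treats the XHTML type as opt-in.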