Anne van Kesteren

DOCTYPEs

9 February 2005

A while ago A List Apart published two articles that were related to each other in a way: one about validating custom attributes and elements and one about JavaScript triggers. Both articles raised many interesting discussions and questions. For example:

Are we allowed to extend XHTML 1.0 to add an attribute?
Is it useful to extend XHTML on your own?
What are browsers doing with custom DTDs?
Is validation even useful when you are adding some attributes for scripting purposes?
Can we not use the CLASS and perhaps the ID attribute for those extra needs?

Peter-Paul addresses some of these questions in "JavaScript triggers" - wrapping it up. Personally, I do not think validation should be taken that seriously in this regard. DOCTYPEs can be seen as a tool that lets you switch between different rendering modes. It does not really matter whether your custom REQUIRED attribute validates. You know it is there and what its purpose is. You know what causes the single error the validator is complaining about. You know on which elements it will have an effect and on which it will not. So as long as you document that for the team you are working with, you are fine. And you can just use the DOCTYPE to make sure you trigger standards mode in Safari, Opera and Mozilla. (And perhaps even in Internet Explorer 6, if you do not have a comment before it.)

In the end, however, it is much more useful when there is no need for custom attributes. When specifications address what authors need, you do not need a custom REQUIRED attribute because it is already there. That also means you get a native implementation in the browser, which enables a lot more options than your custom implementation does.

Back to Peter-Paul Koch's wrap-up. I do not quite get what he means by the movement to XML. I am personally of the opinion that XML provides a meta language and that once XHTML is supported we will use that. The reason we will be using XHTML and not our own custom set of elements and attributes is obvious. Semantics are very important, and since there is no way to define semantics for custom elements, you need a language that has semantics and is widely deployed to make sure other people and software (Google) understand them. XHTML is such a language. (Actually, it is only such a language when served as text/html, so it becomes (invalid) HTML again. But eventually, when bots and such support XML and namespaces, XHTML will be such a language as well.)

Eventually we will not even need DOCTYPEs anymore. When you are using XHTML with an application/xhtml+xml MIME type, the generated DOM will look exactly like what you created. There are no differences, since there is no such thing as "tag soup" in XHTML. While the generated DOM in HTML for a P element nested inside an A element can turn out to be different from what you expected, in XHTML that will not be the case. (Or should I say XML?) The same goes for new attributes and elements, of course.
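The point about the XML-generated DOM can be illustrated with a small sketch. It uses Python's standard xml.dom.minidom as a stand-in for a browser's XML parser, which is an assumption: a generic XML parser is not a browser, but it follows the same well-formedness rules. The sample markup, the lowercase required attribute and the helper names nesting_path and is_well_formed are made up for illustration.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Illustrative markup: a P nested inside an A, plus a custom attribute
# that the XHTML 1.0 DTD does not define.
SOURCE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<body><a href="/"><p>Home</p></a>'
    '<input type="text" required="required"/></body></html>'
)


def nesting_path(node):
    """Return the element names from the root element down to `node`."""
    names = []
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        names.append(node.tagName)
        node = node.parentNode
    return list(reversed(names))


def is_well_formed(markup):
    """Report whether the XML parser accepts the markup at all."""
    try:
        minidom.parseString(markup)
        return True
    except ExpatError:
        return False


doc = minidom.parseString(SOURCE)
p = doc.getElementsByTagNameNS(XHTML_NS, "p")[0]
field = doc.getElementsByTagNameNS(XHTML_NS, "input")[0]

# The P stays inside the A, exactly as written in the source.
print(nesting_path(p))

# The custom attribute is kept in the tree without any DTD involvement.
print(field.getAttribute("required"))

# Mismatched tags are rejected outright instead of being repaired.
print(is_well_formed("<p>unclosed <b>tags</p>"))
```

The parser never consults a DTD here, so the custom attribute and the unusual nesting both survive untouched. The price is that markup which is not well-formed is refused rather than corrected.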

Comments

- Perhaps a bit offtopic, so two questions:
  1. How are mimetypes handled if a file is read from the filesystem instead of http?
  2. Why isn't the xmlns attribute used?
Posted by J van Velzen at 11:49PM
Custom elements and semantics
My 'custom' elements are not made up by me, with my private semantics. (If so, I could use the class attribute and get valid pages.)
My problem is that my 'custom' elements do have defined semantics, but those semantics are defined in other namespaces than xhtml's.
And my blog provider is enforcing validation, making custom extensions to xhtml a real pain.
Posted by Jan Egil Kristiansen at 11:49PM
J, browsers are guessing unless the underlying OS provides a way to provide the information. Basically, if you have a .xhtml file Mozilla will treat it as application/xhtml+xml. Likewise for .html, treated as text/html. Not sure why people are not using namespaces if that is what you are referring to. Probably because support for those in text/html documents sucks. (And that is not really a browser problem, since HTML does not know the concept of namespaces.)
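A rough sketch of that kind of extension-based guessing, in Python. The lookup table, the fallback type and the function name guess_mime_type are all hypothetical and only mirror the two cases mentioned above; this is not how any particular browser implements it.

```python
import os

# Hypothetical table covering only the two extensions mentioned above.
EXTENSION_TO_MIME = {
    ".xhtml": "application/xhtml+xml",
    ".html": "text/html",
}


def guess_mime_type(path, default="application/octet-stream"):
    """Guess a MIME type from the file extension alone.

    The default is an assumption for unknown extensions; a real browser
    may sniff the content or ask the operating system instead.
    """
    extension = os.path.splitext(path)[1].lower()
    return EXTENSION_TO_MIME.get(extension, default)


print(guess_mime_type("/home/j/test.xhtml"))
print(guess_mime_type("/home/j/index.html"))
```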
Jan, you should really be using XML Schema or RelaxNG. DTDs are not really namespace aware and such.
Posted by Anne at 12:05AM
Standards mode can be triggered in IE6 by an unknown DOCTYPE (this is true for Mozilla to some extent, but see Mozilla's DOCTYPE sniffing).
Mozilla also uses standards mode for some MIME types, despite the DOCTYPE.
Posted by Rimantas at 12:13AM
Content-authors should not have to wade through Validation errors to determine which are "OK" ('cuz we're using proprietary attributes) and which are to be corrected.
If you want to make having a DTD useful (it's not used at all for DOCTYPE switching for application/xhtml+xml documents, at least not in Gecko), then it should accurately describe the document it's attached to (including any proprietary extensions). Otherwise, there's no point in having a DOCTYPE declaration at all.
Posted by Jacques Distler at 12:42AM
If you decide to use proprietary extensions, the responsible thing to do would be to describe those extensions programmatically.
Not sure what the controversy here is, unless it's over using proprietary extensions in the first place.
Posted by Evan at 2:16AM
We should not extend XHTML. That's the work of the W3C. If we want to use our own languages, we should use XSLT to transform them to valid XHTML.
Posted by Alita at 2:27AM
Jacques, I should probably have pointed out that DOCTYPE switching is indeed only useful in a text/html context. And that it is therefore useful to just use one while not caring about validation.
Posted by Anne at 2:46AM
Alita: aaahhh, so that’s why XHTML is an abbreviation of eXtensible HyperText Markup Language!! ^__^
No kidding though, with regards to semantics I think Anne is absolutely right that if you are going to define stuff yourself, not according to any standard but your own, semantics are lost. However, that doesn’t mean you shouldn’t extend XHTML... I’ve done just that on one of my pages to make the tables sortable using a g:sort="yes" (or asc or desc) attribute. I don’t think, at all, that that is a bad thing which reduces the semantic value of the page.
Besides, isn’t it better that if people are going to create something of their own, they do it with XHTML as a basis (which at least makes it partially understandable from a semantics POV), instead of doing everything over from scratch?
~Grauw
Posted by Laurens Holst at 8:54AM
Oh, I found a nice section in the XHTML 1.0 specification: 3.1.2. Using XHTML with other namespaces. It seems to acknowledge that when you use different namespaces, the DTD can be problematic, and says work to address the issue is in progress. So I guess that means that even the spec agrees that it’s reasonable to have small additions to the document break the validation...?
~Grauw
Posted by Laurens Holst at 5:12PM
The whole point of XHTML 1.1 was to make it modular, so that — unlike XHTML 1.0, which you cite — it is possible to (relatively easily) extend the XHTML 1.1 DTD to describe your additions.
If you're gonna start adding extensions to your XHTML documents, you really ought to be doing it in XHTML 1.1.
Posted by Jacques Distler at 11:51PM
Jacques, unfortunately, the modularisation of XHTML into XML Schema isn’t finished yet. DTDs are a mess with regard to namespaces, and are hard to generate using techniques such as XSLT, so I’m really waiting for the W3C to finish up their Modularisation into XML Schema right now.
~Grauw
Posted by Laurens Holst at 8:58AM
Yes, you can use the elements from the XHTML namespace in your own languages. But on a public website you should use web standards, and your own language is not such a standard. A DTD or an XML Schema is needed to ensure your site is standards compliant.
And do not extend the XHTML namespace. You are not the owner of this namespace.
Posted by Alita at 5:13PM