Currently there are 50 X-Philes. These comply with the following guidelines:
These days most people have problems with the first two, not to mention the last. Few people are aware that the W3C has additional guidelines, in the form of a W3C Note: XHTML Media Types. Though you would normally not need to comply with a W3C Note, the things mentioned in it make sense and don't cause many problems if you are an X-Phile already (IE is doomed, no matter what).
IMHO every X-Phile should follow these additional guidelines. These are little things like:
<meta http-equiv="content-type" content="application/xhtml+xml;charset=utf-8"/> (I think you shouldn't use http-equiv at all; everything http-equiv does can be done by the server. I know I use one at the moment; I'm going to fix that some day.)
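The "let the server do it" point can be sketched as a minimal WSGI app (illustrative only; the function name and markup are made up, not from this post):

```python
# Minimal WSGI sketch: the Content-Type (and charset) travel as a real
# HTTP header, so no http-equiv meta element is needed in the markup.
def app(environ, start_response):
    body = (b'<?xml version="1.0" encoding="utf-8"?>\n'
            b'<html xmlns="http://www.w3.org/1999/xhtml"><head>'
            b'<title>demo</title></head><body><p>hi</p></body></html>')
    start_response('200 OK',
                   [('Content-Type', 'application/xhtml+xml; charset=utf-8')])
    return [body]
```

Any server-side language can do the same; the only requirement is that the header reaches the client before the markup does.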
I think not everyone knows what a processing instruction (PI for short) is; I actually got some questions about it (I will try to explain more when I write about XML and such). Everyone knows this:
<?xml version="1.0" encoding="utf-8"?>. That looks like a PI (strictly speaking, the XML declaration is not a PI, but it shares the same syntax). You can recognize a PI by its '<?' opener. One for referencing a style sheet looks like this:
<?xml-stylesheet href="aural.css" type="text/css" media="aural"?>. Note that the XML declaration, which carries the version and encoding, must come first. PIs are defined in the XML recommendation: 2.6 Processing Instructions.
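To see that a PI really is its own kind of node, you can parse a document and inspect the prolog. A quick Python illustration (my own, not from the post):

```python
from xml.dom import minidom

# A stylesheet PI sits in the prolog, before the root element.
doc = minidom.parseString(
    '<?xml-stylesheet href="aural.css" type="text/css" media="aural"?>'
    '<html xmlns="http://www.w3.org/1999/xhtml"/>'
)
pi = doc.firstChild  # the processing instruction node
print(pi.target)  # xml-stylesheet
print(pi.data)    # href="aural.css" type="text/css" media="aural"
```

The target ("xml-stylesheet") tells the application which PI this is; the pseudo-attributes in the data are defined by the "Associating Style Sheets with XML documents" Recommendation.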
At the time the X-Philes list started, it was tough. Now it is not that difficult anymore, since lots of people have written about it and told others how to do it. It is still great, but I think there should be some stars you can get if you pass more additional guidelines. Like:
Musings**. Valid XHTML1.1 plus MathML 2.0. Still the most technologically advanced weblog, ever.
(The follow-up post would be called "X-Philes: Revolutions", with this idea implemented somehow. Until then, this will be just another reference for non-X-Phile and X-Phile people :-))
Since I like asterisks, I agree :P PIEN: art and antiques already serves the Content-Type: application/xhtml+xml; charset="iso-8859-1" header.
I'm not so sure about your 3rd guideline. Internet Explorer (which gets my pages sent as text/html) coughs up a hairball when it encounters the <?xml-stylesheet ...?> PI.
For that reason (and for other less-capable browsers) I continue to prefer sending <style ...> or <link rel="stylesheet" ...>, which is perfectly legal XHTML (even though it may not be legal for other XML dialects).
On your larger point, it would be nice if all 50 X-Philes could still pass the test today. I'd worry about that before adding additional guidelines.
500 Can't connect to minzweb.de:80 (Timeout)
…the id-attribute of the body-element, use the p-element in your blockquote, and you're done.
Well, damn it!
Seems like I really have a problem with my hosting company.
I apologize, I had to hit the validate button 5 times before the validator was able to access the site.
Of course I IMMEDIATELY fixed Braintags after reading this list. All pages validate again, and I promise to watch out next time I change my templates again...
Hmmm, I am getting there; if it weren't for Micro$oft IE choking on the <object> element and Opera crashing on the DOM instructions I fed it, I would probably be listed now.
I am all for using the <?xml-stylesheet ...?> PI once I've untangled the desolate XHTML file corpses I left in 1999 and performed necromancy to get them functional as an Application of XML. I agree in principle; when I read both the XML and XSLT Recommendations during early 2000, they did cover such topics.
Anne, you're suggesting some people get "brownie points" for closely adhering to the XML 1.0 Technical Recommendations.
Maybe someone should start a sublist of sites that meet the tighter set of criteria outlined by Anne.
Robert, A List Apart has a great article about valid <object> embedding in XHTML documents, called Flash Satay.
I have a problem with sites that claim to be XHTML 1.1 compliant when they use MathML to circumvent the deprecated "style" attribute in XHTML 1.1. Yes, the W3C validator permits using XHTML 1.1 as a host language for languages that break the XHTML 1.1 rules, but come on, this breaks the fundamental principle of separating data from formatting. **Please**, don't use inline styles if you use XHTML 1.1 in any fashion.
Thanks Asbjørn, I knew about such <object> issues; it's just that the browsers can be too finicky under certain circumstances regarding the said element, which makes life more fun.
OK, I see the point of requirements 1 and 2, since specifying
http-equiv meta tags is essentially a hack introduced to cater for people who didn't have any server access. I'm also pretty sure that there can be parsing issues with these; trying to hit as little (unpredictable) error-handling code as possible is obviously one of the reasons for going with well-formed, valid markup as opposed to the junk that most sites spew.
On the other hand, I fail to see any benefit in using PIs to reference stylesheets. Actually, that's a lie, I can imagine some generic XML parser, with no specific XHTML knowledge coping better with an XML PI than with
link. On the other hand, since the browsers that support such a construct are currently in a minority, trying to support it would probably involve more hacks to present content to the less technically competent members of the browser community. Also, most XHTML style sheets rely heavily on the cascade; authors make certain assumptions about how things will display in the absence of extra style information (from basic stuff, like "paragraphs are block-level elements", to really specific stuff, like "the default font size is 16px"). Given this, it's pretty clear that a generic XML layout engine with no specific XHTML knowledge would do a bad job of laying out an XHTML page in any case.
Having said all of that, I agree that it's more aesthetically pleasing to use a generic mechanism for linking stylesheets rather than using language-specific features. I just dispute that there's any practical benefit.
James (comment 9): Can you give an example? The only sites in the X-Philes list which I know use a +MathML doctype do so because they wish to embed maths content (admittedly, my site has a +MathML doctype on the main page, but I haven't got around to adding any maths content because I haven't resolved the problems I was having with Itex2MML, not because I'm trying to use the style attribute).
Bear in mind, also, that "don't use the style attribute, under any circumstances, ever" is a dogma that many people disagree with, to the extent that it is being included in XHTML2. I tend to agree that most use of
style is an abuse caused by content creation software that has not adapted to the medium, but I can think of a very few instances where it would be useful.
Let's take Musings, for example, only because Anne mentioned them (sorry to single them out). They use the "style" attribute a lot, and their Web site validates as XHTML 1.1 because of the inclusion of the MathML doctype. Just like Anne, I believe they should be commended, or get "more stars" :-), if they use more Web standards. But if they want to use inline styles then they should not put the XHTML 1.1 button on their site. Yes, inline styles are useful in certain circumstances, but it's a slippery slope. Inline CSS styles, just like the font element, carry no semantics. If you have thousands of pages that use inline formatting and you need to reformat them, it is almost impossible to do this programmatically.
The Inline Style Module is part of the XHTML 1.1 DTD. It was not "added" to the
XHTML+MathML 2.0 DTD.
The Target Module, on the other hand, was added to the
XHTML 1.1 + MathML 2.0 DTD. I do use that on a few of my pages. The rest of my pages -- those without MathML content -- validate just fine as XHTML 1.1. I may be lazy in slapping an
XHTML 1.1 + MathML 2.0 DOCTYPE on all of my pages whether or not they actually contain MathML. But that laziness has nothing to do with my use of the inline style module.
As to whether there are valid use cases for inline styles...
If, for instance, you think you have an alternative to the methods outlined here which does not involve the use of an inline style, I'd love to hear about it.
Just to clarify my "DOCTYPE laziness": anyone commenting on one of my posts can use MathML in their comments. Upon rebuilding the page, I would have to figure out whether MathML was being used in the body of the post or in any of the associated comments and adjust the DOCTYPE of the page accordingly. This would be way too much trouble, for zero benefit.
How would you embed an SVG with a width of 7em and a height of 10em?
The style attribute has its uses. I do agree it should be removed, 'cause it isn't switchable and it is media-specific.
Especially the media-specific part is bad (worse: evil), since that was exactly the reason we needed style sheets in the first place: designing for multiple media (IMHO, of course).
So is it agreed that, if we are to have lists of sites that comply to some set of standards, we should judge them by those standards and not by arbitrary additional (aesthetic?) criteria that we place on top of those standards?
You just want your link element back :-), but the proposed standard about media types says this:
When serving an XHTML document with this media type, authors SHOULD include the XML stylesheet processing instruction [XMLstyle] to associate style sheets.
IMHO it should read 'MUST' (referring to the same RFC), 'cause it is an XML document and there should not be too many exceptions.
The only problem there is with PIs for style sheets is that relative referencing is not properly defined yet, but there is work going on to do so. Actually, I'm making test cases for that (in my free time).
So, with the XML stylesheet PI(s), how do you prevent Internet Explorer from barfing? While I don't care that much about how my pages render in IE, I see no reason to gratuitously exclude IE users from even viewing them.
To stop M$ Internet Explorer choking, you would only apply the XML stylesheet PI for the user agents that can accept application/xhtml+xml, which would be the most logical resolution and would probably still result in one gaining Anne's reloaded "brownie points".
Robert: So in this scheme, anyone not dynamically generating their pages is excluded? Should we also add a requirement that pages be sent as valid text/html to non-XHTML browsers, including switching the doctype and changing all the /> to >? There are some browsers that will (quite correctly) display the / from /> (Sun's Java-written browser, for example). Certainly, failing to do this breaks the spec (you're sending an XHTML document as text/html) and breaks compliant browsers. On the other hand, using a PI, as far as I know, has no positive effect on any current UA and (again, as far as I know) isn't part of some existing standard.
Anne: I have no particular desire to maintain my
link tags. I just don't think that making value judgements about certain pieces of legal markup, and publicly rating websites according to that judgement, is a helpful thing to do. As it is, the percentage of XHTML-compliant pages is something like 10-4% of the pages that Google indexes. There's no need to add further levels of complexity to the process of producing compliant sites, particularly when the knowledge needed to tackle the extra complexity is held in weblogs rather than in specifications.
<sup> doesn't seem to be working in your comments (specifically, the text of my previous comment that reads 10-4% should have -4 as a superscript). Moreover, you might like to put some of your comment guidelines near the actual comment box and make them less complex (I've just read them to see if
<sup> is supposed to work). If you really want people to use classes on code blocks, for example, you'll need to provide an intuitive way of adding the classes, rather than relying on the user to type them.
Are you volunteering to improve my commenting system? :-) (
sup does work BTW)
I'm still not sure if sending an XHTML document as
text/html is equal to breaking the specification, though it is quite evil.
Not sure if you already did know this: Associating Style Sheets with XML documents Version 1.0.
For dynamically generated pages it is not that hard to do. I send my pages as
application/xhtml+xml with the appropriate headers and stylesheet PI to those browsers that support it. Those who do not (read IE) will get
text/html (with appropriate
meta tag) and the stylesheet through the
link element. Trying to do this with static files would be harder, involving some server voodoo or such.
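The scheme described above amounts to a small content-negotiation check. A hedged Python sketch (the helper name is mine, not actual code from this site):

```python
XHTML = 'application/xhtml+xml'

def pick_content_type(accept_header):
    """Pick a MIME type from the client's Accept header.

    Browsers that explicitly list application/xhtml+xml get it;
    everyone else (read: IE) falls back to text/html.
    """
    if XHTML in (accept_header or ''):
        return XHTML
    return 'text/html'
```

A real implementation would also honour q-values in the Accept header; IE sends Accept: */*, which this check, correctly, does not treat as XHTML support.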
Ultimately, I think we should have the content originally as XML and then send it as whatever the client can handle (HTML, XHTML, WML or anything else) through server-side XSLT with all the appropriate bells and whistles.
Well, my site is not only accessible to browsers who send the
application/xhtml+xml header. Everyone else gets an error page. Now that I've cut off 95% of my potential audience, I can safely play with whatever markup is in vogue at any given time, safe in the knowledge that any future visitors I alienate will be insignificant in number compared to those who have already been rejected.
Excuse me while I find a Prince CD, I need to party like it's 1999.
That "not" should obviously read "now".
I serve up static pages, and have no intention of changing.
Golem is a 3-year-old, 466 MHz Macintosh G4 and, in addition to being my mailserver/webserver/..., it is the desktop machine upon which I occasionally try to do real work. I am not going to waste CPU cycles so that some XML-standards geek can receive my CSS stylesheets via XML stylesheet PIs instead of via the perfectly valid and universally-supported link element.
Where is the analogue of Hixie's famous article on the evils of serving XHTML as
text/html, which would explain to me in simpleton's terms what advantage accrues from using XML stylesheet PIs instead of link elements?
I will revisit this decision when Anne restores the comment-preview page to his blog and starts serving both that and his search page as application/xhtml+xml.
For that matter, as a few of the X-Philes have shown, if you are going to serve dynamic pages, rather than just tinkering with the stylesheet directives, you could equally well convert the whole page and serve XHTML 1.1 to capable browsers and HTML 4.01 to the rest.
On the one hand, that would be the totally technically-correct thing to do ("Serving XHTML as text/html is evil!"). On the other hand, if your content can be losslessly translated into HTML 4.01, why even bother to serve it up in an alternate XHTML version?
Jacques: It's not quite true that you can't serve different static content to different browsers; apache's content type negotiation (e.g. using
MultiViews) would allow this. The main problem is that you'd need to generate two copies of each page, one XHTML and one HTML. I guess (without actually knowing) that you could hack MT to do this, but it seems to be a lot of effort for almost precisely* no gain. Even so, embedding other namespaces in html 4 is probably considered evil, so you'd have to dynamically convert all the SVG and equations to gifs (PNGs are probably even less evil, but they won't work with older browsers) with
alt text (actually, that might be an interesting challenge; is it possible to convert MathML automagically to a plain-text form that is roughly equivalent to what a person dictating the equation would say? Presumably, this is what an aural user agent is expected to do). Then you could consider yourself standards-compliant. Well, until someone makes up some new rules (again) and you have to change everything (again).
* Unless you consider providing style information to generic XML parsers to be a substantial gain. No one has yet explained why anyone should care about this.
Multiviews. That would require hacking MovableType, as the latter has no concept of there being two (or N) individual archive pages for each post, two monthly archive pages for each month, and so forth. Not impossible, I suppose, but definitely requiring some heavy hacking of the code (in multiple places).
What's this fabled "generic" XML processor of which you speak? Styling information is needed only when you try to render XML, not when you merely wish to "parse" it. Where in the world is there an XML renderer that knows how to render XHTML, but does not know about the link element?
Isn't the very existence of such a beast an oxymoron?
Actually, I'm not sure that Multiviews would cut it (though mod_rewrite certainly would). Consider the case of Camino. Dave Haas's Camino builds get served application/xhtml+xml. All other Camino builds get served text/html (lest they barf on the MathML content). This is not a matter of content-negotiation, as all these versions of Camino transmit the same "Accept" headers. So Multiviews won't help.
There is something really creepy about this discussion. Here I am discussing rewriting MovableType, doubling the number of static pages I create, and engaging in some heavy-duty mod_rewrite trickery. And I still don't know why.
Precisely how would all this work make the world a better place?
The idea of a generic XML renderer isn't quite so far-fetched; Mozilla can render a made-up XML dialect using CSS styling. In fact, I think that's pretty much what it does with XHTML*, except with a built-in UA stylesheet that takes care of the expected defaults (paragraphs are display:block, spans are display:inline, and so on). XHTML also has an additional mechanism for linking stylesheets above and beyond the PI method provided for generic XML: the
link element. As far as I can tell, the entire point of this discussion is that it would be, in some sense, nicer to avoid using dialect-specific methods where a generic XML solution is available, since an XML UA might not pick up the stylesheet, and so wouldn't render the page as expected unless it had specific XHTML knowledge. (For this to actually work in practice, you'd need to link something like Mozilla's html.css into the page, which would add 10K of CSS and break the cascade.) However, this isn't a problem that's specific to XHTML: how does one expect a generic XML renderer to deal with SVG, MathML or XSLT without inbuilt knowledge of the elements?
Therefore, I totally agree with you that in the real world, the benefit of using PIs is massively outweighed by the disadvantages (have a dynamically generated site, exclude all IE users, or create multiple copies of each page you serve). Unless anyone else has a better reason why PIs are a useful technology in the context of XHTML 1.x?
* Obviously there are other differences between treatments of XHTML and generic XML in other areas; I think XHTML documents get a HTML-like DOM, for example.
Absolutely DOM would be an issue. Try serving your pages as text/xml, and notice how cookie-handling (for instance) breaks.
But, again, I don't see how you can render XML without knowing what the elements of that XML dialect are. How are you supposed to know what the CSS style attributes for a particular element mean if you don't have any knowledge of what the elements are?
If the concept of a "generic" XML renderer made sense, I'd recommend that y'all use that off-the-shelf renderer to whip together something that renders MathML. If only it were so easy...
Eventually parsers will have to support a set of standards, and with those standards (which are probably defined by the W3C) they will be able to render almost every existing XML document.
If you take a look around at the site of the W3C, you'll see that they are working on these (most of the time) massive specifications, which will get the job done.
Parsing is parsing and rendering is rendering.
No W3C specification is going to magically result in a general-purpose XML renderer which can render XHTML and SVG and MathML and ChemML and a thousand other XML dialects.
When do you think we will ever see this mythical (and, IMO, oxymoronic) XML renderer which can render XHTML, but which does not understand the link element?
I don't think there will be any futuristic parser that can render current XHTML (1.x and 2.0), since it relies on a lot of specific things, like the href attribute.
I think that recommendations like HLink can solve these problems, though it will take a lot of time before all those basic recommendations are there and correctly supported.
Jacques: I'm agreeing with you. I don't think that there will be a magic XML renderer that will be able to deal with all dialects of XML. That is, however, the only plausible reason I can imagine for preferring generally unsupported PIs over universally supported link elements. Don't be confused by the fact that I've shut off browsers that don't accept application/xhtml+xml from my site. That move is intended as a comment on the difficulty of following supposed best practice to the letter without pragmatism about what the standards are trying to achieve. When one has a readership that struggles to reach single figures, one can do such things :).
Anne: Suppose some XML parser/renderer did have support for all the 'generic' XML technologies such as XLink (or HLink, or whatever's winning that argument at the moment), CSS, XSL:FO, or whatever. You've still to explain how it would manage to render, say, a MathML document with a set of complex equations, without specific knowledge of (say)
<mfrac> elements. Essentially, that will be impossible until such a time as CSS is able to describe all the positioning possibilities of MathML (i.e. never). The situation is even worse with something like SVG, where the rendering is highly specific to the particular language. In an extreme example, one could imagine an XML bitmap language with a form like:
<graphic>
  <row>
    <pixel red="255" green="0" blue="52" alpha="0" />
    <pixel red="255" green="0" blue="52" alpha="0" />
    <pixel red="255" green="0" blue="51" alpha="0" />
    <pixel red="5" green="143" blue="22" alpha="0" />
  </row>
</graphic>
there's no way that some generic XML renderer could extract useful information from that without specific knowledge of the language's properties.
Now it's true that (some future version of) XHTML might be a simple language, in the sense that the content model for each element in the spec is text, except in cases where the element belongs to a more generic XML class of elements (an HLink, or an XML PI, for example). That might work, but it's certainly not the type of language we have now in XHTML 1.x, or even XHTML 2: both of these require specific knowledge of the language semantics to parse and render documents effectively. If this is the future you want, I suggest that you go and make the point on www-html and start campaigning for a 'pure' XHTML 3. Let's just hope that Microsoft hasn't replaced the web with some proprietary medium by 2050, or whenever XHTML 3 is expected to be in general use.
To be clear, I have used the word "render" loosely (i.e. inconsistently) in previous comments. In the context of Mozilla being a "general XML renderer", I mean "render" in the sense of "display the text content of the document". It is conceivable that this definition of "render" might be good enough for some future version of XHTML; a language that may be rendered in this sense is what I referred to as a simple XML dialect.
Obviously the better use of the word "render" is "display in the intended form"; this is the type of rendering that no generic XML parser will be able to achieve, since it relies on format-specific semantics.
The (IMHO insignificant) advantage of using PIs for linking stylesheets is in facilitating rendering in the first sense. However, as Anne points out, using PIs will not help with other aspects of HTML which require specific language semantics such as replaced content (forms and images) and links.
I have nothing against XML stylesheet PIs per se. They have a useful role to play. Many XML dialects do not have a dialect-specific mechanism for associating stylesheets. So if you want to style a fooML document, or a fooML fragment within an XHTML document, you can and should do this with an XML Stylesheet PI. MathML is an example of this sort.
But if you're just implementing "standards" for the sole purpose of spiting your readers...
But if you're just implementing "standards" for the sole purpose of spiting your readers...
Was that aimed at me?
I've re-enabled access to my site for browsers that don't accept application/xhtml+xml. In fact it's a bit better now, since I'm checking the Accept header rather than the User-Agent. I fear I failed to make the point I was aiming for. That point was supposed to be this: It is impossible to create a site that complies with every nuance of every standard without shutting off your audience. In order to comply with one standard (send XHTML documents as application/xhtml+xml) fully, I have to prevent the majority of potential visitors from visiting my site. Without a sophisticated CMS, using PIs instead of link elements would require similar action. If I wanted to use content MathML rather than presentation MathML (they are both specified, but content MathML is supposed to be 'better', as in 'more semantic'), then I'd have to cut off another group (Mozilla users). That's why I don't believe in having a stars system for extra standards compliance.
My problem, I think is lack of eloquence. Lack of eloquence and inaccuracy ("In fact, I think that's pretty much what it does with XHTML" - what was I thinking).
My two problems are lack of eloquence, and lack of accuracy. And verbosity.
My four problems are lack of eloquence, lack of accuracy, verbosity, and an almost incomprehensible ability to be confusing.
My four... no... Amongst my problems are...
I'll start again :)
No, James, it wasn't directed "at you", it was directed "with you". If adherence to "Standards" makes for a worse user experience, then one has clearly gone off on the wrong track in hewing to those standards.
Your "experiment" with
application/xhtml+xml was an extreme example (worse user experience = no user experience). But even this very comment form provides a more minor example of the same principle. Anne has disabled the "preview" button because he hasn't figured out how to serve up the resulting preview as
application/xhtml+xml. Apparently, serving up his pages with the "correct" MIME type takes precedence over the convenience of his readers, who would surely appreciate the ability to preview their comments before posting.
In the case of my blog, serving up comment previews as
application/xhtml+xml is an absolute necessity, as users can post comments containing MathML, and they need to be able to preview them. In his case ...
Other "X-Philes" have disabled HTML in comments or disabled comments entirely, simply to be able to ensure that their pages end up as valid XHTML. This seems backwards to me. Why would you wish to provide a more impoverished user experience, simply for the nebulous benefit of being able to claim a greater degree of "standards compliance" ?
Jacques, I can't speak for others, but I can tell you why I don't allow HTML in my comments. I think that I personally know most of my small readership, and I know for a fact that about half of them have no idea how to use HTML. They are, however, familiar with web-based bulletin boards, so I use an admittedly imperfect Movable Type plugin that I wrote to emulate the pseudo-HTML used by such boards*. Not a great solution, but it makes sense for my readers because they know how to use it, and I end up with valid XHTML markup** in my comments, because I can control what kind of HTML ends up on the page. My readers are happy, and I'm happy.
Thanks for reminding me, though, that I need to add the "remember me" cookie functionality back into my blog, which I removed a while ago because of some issue that I can't even recall now.
* I'm thinking of switching to MT-Textile, though, because it hews closely to the way people tend to "mark up" plain text with asterisks and underscores.
** As for my reasons for using XHTML, none of them are compelling. Initially, using the XHTML doctype and MIME-type were simply excuses to tinker with my site. That's still true, and now I'll add that it gives me a good excuse to learn XPath.
Using Textile, or some similar text-formatting plugin is a fine solution. Arguably, it is an enhancement of your users' experience over making them type raw HTML. Other people, with a more technically oriented audience, might find another solution to be optimal.
My point was that, in some cases, people are choosing between the interests of their readers and strict "adherence to standards". In such situations, I think it's perverse to choose the latter.
Jacques, just how do you serve up previews as application/xhtml+xml? It would be a great help to many of us. I allow previews in text/html simply because I know they're important, and I can't figure it out in XHTML.
I tried to find a contact email address for you via your site, but couldn't find it.
Check out his MT Howto's. One of them: Serve it Up!, is the one you are looking for.
For this more technical audience, let me explain the reasoning of that article.
…mod_rewrite to set the MIME type for those static pages.
…application/xhtml+xml. You could just as easily use the Accept headers, if that works for you.
…application/xhtml+xml right in the CGI scripts, I just check whether a certain environment variable is set. Then I use
mod_rewrite to set the environment variable, using the same logic as I use for my static pages.
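The two mod_rewrite steps described above might look something like this in an Apache config. This is a sketch under assumptions: the environment variable name and file pattern are illustrative, and the article being summarized sniffs User-Agents where this sketch checks the Accept header.

```apache
RewriteEngine On

# Set an environment variable for clients whose Accept header
# lists application/xhtml+xml.
RewriteCond %{HTTP_ACCEPT} application/xhtml\+xml
RewriteRule .* - [E=XHTML:1]

# Use that variable to serve static .html pages with the right MIME type.
RewriteCond %{ENV:XHTML} =1
RewriteRule \.html$ - [T=application/xhtml+xml]
```

CGI scripts can then read the same XHTML environment variable instead of repeating the negotiation logic.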