Anne van Kesteren

Website checklist

18 March 2005

Sometimes somebody asks me to check some site. Since I always focus on the same points, I thought it might be nice to put them here and save myself some time. (Although I may not point to this place in the end, because when I have some spare time I like to do the check myself.) My checklist is pretty basic and excludes design, although I might say a word or two about it when it is really ugly. Anyway, that list:

Semantic markup – commenting on things like class="invisible", class="red-top-border", tables for layout, too many DIV elements when I can see that fewer must be possible. Not using a list for navigation, using too many CLASS attributes because the word inheritance was unbeknownst to them. Well, you know, the usual stuff. I wrote something about it in Dutch: Structurering: Het draait om semantiek. And part of my English interview was about this. (I also really dislike inappropriate META elements. Especially those with an HTTP-EQUIV attribute.)
Character encoding – really simple, I want it to be UTF-8. That article covers quite a few of the advantages. “On the charset parameter of the Content-Type header” and some of the notes here do too. Using something else as the character encoding can be OK (you might want to include some Klingon), but it is not recommended and mostly not very useful.
Markup languages – HTML for the user interface, generic XML or XHTML for the backend and Atom for syndication. (Conversions are best done using XSLT.) You might want to have RSS 0.91 or 0.92 if you need to support legacy feed readers. Other RSS versions are not really a good idea, although 1.1 looks promising.
MIME types – I discussed this some time ago in “MIME types you should use.” Please take into consideration that text/xml is evil. In short, use text/html and application/xml. You might want to use some specific XML MIME types for syndication, but you can sort that out yourself. (I'm leaving JavaScript and CSS aside here; that seems obvious.)
IRI design – IRIs are a superset of URIs. (More.) I have written some things on this subject before (twice) and I will try to summarize that here. You should make sure that they keep working and don’t break as that harms user experience. You can ensure that by not relying on IDs that are generated by the database. You should avoid extensions when possible, but provide them if you are using content negotiation so people can pick the PDF version when they desire. You should avoid ? from appearing in your IRIs and you should make sure they are usable and user friendly by making things like /contact just work. (It is not that you should expect the user to guess certain files or directories, but that they can remember it and that search engines can extract meaning from them by comparing the IRI with other bits of the page, like the title and such.)
Validation – I do not really care about that. It might be useful to do for debugging display problems though. Enabling standards mode probably helps.

(You might want this to be a definition list, but it isn’t.)
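The character encoding and MIME type points above can be sketched in a few lines. This is only an illustration under stated assumptions: the names MIME_TYPES, content_type and encode_body are hypothetical, and application/atom+xml is included just as one example of a syndication-specific XML type. The only rules taken from the checklist are "UTF-8", "text/html and application/xml", and "never text/xml".

```python
# Hypothetical helper names; the mapping follows the checklist's advice:
# text/html for the user interface, application/xml for generic XML,
# and text/xml is deliberately absent.
MIME_TYPES = {
    "html": "text/html",
    "xml": "application/xml",
    "atom": "application/atom+xml",  # example of a syndication-specific type
}


def content_type(kind: str, charset: str = "utf-8") -> str:
    """Build a Content-Type header value with an explicit charset parameter."""
    return f"{MIME_TYPES[kind]}; charset={charset}"


def encode_body(text: str, charset: str = "utf-8") -> bytes:
    """Encode the response body with the same charset the header announces."""
    return text.encode(charset)


if __name__ == "__main__":
    print(content_type("html"))  # text/html; charset=utf-8
    print(content_type("xml"))   # application/xml; charset=utf-8
```

Keeping the header and the actual byte encoding in one place is the design choice here: the charset parameter is only useful if it matches what is really sent.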

Comments

Let us be thankful that priority differentiation is not yet banned, and variety is welcomed, which implies that there is no single “shining path”, the only politically correct path.
Posted by Moose at 9:54PM
Exactly those points most 'webdesign' companies really don't care about. And there are a lot of those, so we still have enough to check.
Not using a list for navigation

'General · Learning · Thoughts' isn't a list for navigation? :)
Posted by Krijn Hoetmer at 10:02PM
Plus print stylesheet (useful one) and friendly URLs. There is more in my list, but those are more pedantic.
Posted by Rimantas at 10:24PM
Rimantas, you are right about IRIs, I should add them to the list.
Posted by Anne at 10:41PM
(You might want this to be a definition list, but it isn’t.)

Hmmm. Don't terms depend on their context? Isn't this entry context enough to see it as a list of definitions? Or are you saying <dl> is for universal truths only?
As I see it, it does depend on context, heavily. No offense, but you don't need to state it's not as others see it; it's obvious you've got another opinion on this from the way you did it, and this boils down to one's point of view anyway.
A nice thing to think about, isn't it? :)
Posted by Michael Zeltner at 10:48PM
Validation – I do not really care about that. It might be useful to do for debugging display problems though. Enabling standards mode probably helps.

Why not?!? I'm confused about that one.
Posted by Turnip at 11:41PM
Why not?!? I'm confused about that one.

Because validation is there to help you. Sometimes you can't specify the doctype just because IE breaks some things in standards compliance mode.
Posted by Patrys at 2:24AM
BTW: Sometimes validators complain about perfectly fine ECMA code (and no, I don't mean "document.write()").
The things I also check include lowercase tags, proper code indentation (there is nothing worse than a whole document contained in a single line - then I must depend totally on the wonderful Firefox extension called WebDev) and elasticity (to make sure that everything won't collapse as soon as someone changes one letter in a single paragraph).
Posted by Patrys at 2:32AM
I quite agree with all the points you have raised there.
Regarding validation, I agree with you. But I think that if one can make a site validate (which, if one codes to the HTML recommendation for front-end documents in a proper way, it should do by default), then all the better. But as Patrys says, it's just a guide. It's not really essential that it validates.
Nice article, Anne.
Posted by Chris Poole at 3:45AM
Regarding clean IRIs: What is the best way to accomplish this when using PHP? How exactly do you accomplish this on this site?
I have yet to find a good tutorial or article on this subject that seems reliable and easy to understand. Perhaps you or someone else could direct me to one?
Posted by Matt at 5:20AM
Matt: It's usually done with mod_rewrite and an Apache file called .htaccess. If you search for either on Google you are sure to get plenty of information. Basically, mod_rewrite allows you to map something like http://example.com/page/about/ to http://example.com/page.php?id=about.
Everyone else: There is a difference between writing valid code and code that validates. Perhaps that's where we are confused. I would strongly support writing valid code (as in code that abides by standards), but I'm not too bothered about the validator saying it's valid (particularly with CSS).
Posted by Turnip at 5:32AM
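The mapping Turnip describes can be sketched outside Apache as well. The following is not mod_rewrite configuration, just a minimal Python illustration of the same idea: a friendly path is matched with a pattern and translated into the internal script path. The function name rewrite and the exact pattern are assumptions made up for this example; only the /page/about/ to /page.php?id=about mapping comes from the comment.

```python
import re
from typing import Optional

# Friendly path such as /page/about/ (trailing slash optional).
# The character class is an assumption: letters, digits, "_" and "-".
FRIENDLY = re.compile(r"/page/([A-Za-z0-9_-]+)/?")


def rewrite(path: str) -> Optional[str]:
    """Translate a friendly path to the internal one, or return None."""
    match = FRIENDLY.fullmatch(path)
    if match is None:
        return None  # not a path this rule handles
    return f"/page.php?id={match.group(1)}"


if __name__ == "__main__":
    print(rewrite("/page/about/"))  # /page.php?id=about
    print(rewrite("/contact"))      # None
```

The visitor only ever sees the friendly form, so the internal script name and query string can change later without breaking published links.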
If everyone would post his website checklist on his blog, the web would be a better place.
In my checklist aka Web Design Canon, I also nag about document titles and the use of relative IRIs with absolute paths etc, I'm really into that kind of stuff.
Posted by Mathias Bynens at 5:33AM
Matt - people usually map the other way round :P (but I'm sure you knew that hehe).
my little rant on this stuff
Posted by Chris Poole at 8:02AM
You might want to say "You should avoid ? from appearing in your IRIs unless they are the result of a GET..."
I also agree with Turnip that code should be valid (though one should ignore a broken validator). After all, if a page is invalid, then doesn't that mean it has no well-defined semantics?
Posted by dolphinling at 8:51AM
Eeh, let me revise that last comment.
You might want to say "You should avoid ? from appearing in your IRIs unless they are the result of a (GET) form submission..."
P.S.: Why can't we use <ins>?
Posted by dolphinling at 10:16AM
Patrys wrote:
BTW: Sometimes validators complain about perfectly fine ECMA code (and no, I don't mean "document.write()").

Markup validators don't validate ECMAScript, they don't understand it. However, they do validate the content of the script element, and as such, it must conform to the constraints of the markup language, namely the SGML or XML rules. This issue is explained by the WDG, though it talks about document.write(). However, if this isn't the same problem you encounter, please explain.
Turnip wrote:
There is a difference between writing valid code and code that validates. Perhaps that's where we are confused...

What? Please explain this difference. If you write valid code, then it must validate, or else you have not written valid code.
... I would strongly support writing valid code (as in code that abides by standards), but I'm not too bothered about the validator saying it's valid (particularly with CSS).

In order to write markup that conforms to the standards, it must validate, as it is a conformance constraint. If the validator complains about it, then it is clearly not valid. However, you do have a point with CSS, but such errors should only occur with the use of properties with a vendor prefix. Other errors should be fixed.
Posted by Lachlan Hunt at 11:59AM
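The markup constraint Lachlan mentions can be illustrated with a rough sketch. It assumes the HTML 4 (SGML) rule that the content of a script element ends at the first "</" followed by a letter, which is the document.write() case; whether this is the problem Patrys actually ran into is not known. The function name first_etago is hypothetical.

```python
import re

# Assumption: under HTML 4 (SGML) rules, "</" followed by a letter inside
# a script element is read as the start of an end tag, whatever the
# script language itself thinks of it.
ETAGO = re.compile(r"</[A-Za-z]")


def first_etago(script_text: str) -> int:
    """Return the index of the first '</' + letter sequence, or -1 if none."""
    match = ETAGO.search(script_text)
    return match.start() if match else -1


if __name__ == "__main__":
    risky = 'document.write("</p>")'
    escaped = 'document.write("<\\/p>")'  # the script source reads <\/p>
    print(first_etago(risky))    # 16
    print(first_etago(escaped))  # -1
```

Both strings are fine as ECMAScript; only the first one trips a markup validator, which is the distinction between validating the script and validating the content of the script element.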
Lachlan, and what if I mix certain CSS2 specific properties with certain CSS2.1 specific values into a single stylesheet? (The properties and values are not bound to each other, of course, but just exist in a single stylesheet.) You cannot validate that at the moment. Nor can you validate certain things of CSS3.
You can also not validate certain extensions from Web Forms 2 or validate some embedded RDF usage in your XHTML. Et cetera. As long as you know what you are doing, validation is not really important.
Posted by Anne at 6:41PM
what if I mix certain CSS2 specific properties with certain CSS2.1 specific values into a single stylesheet?

The stylesheet would not be valid CSS 2 and it would not be valid CSS 2.1.
As long as you know what you are doing, validation is not really important.

A single "real" error listed with dozens of "ignorable" errors is much harder to spot than if it were the only error keeping your documents from being valid.
That QA step goes from "check the validator says it's okay" to "read and evaluate every error the validator gives you". It takes much more time, and it is much easier to make mistakes.
Posted by Jim Dabell at 8:06PM
What? Please explain this difference. If you write valid code, then it must validate, or else you have not written valid code.

I'm talking about the W3C Validator sucking. It doesn't understand some of the stuff in the specs, therefore, if you use that valid code it won't validate.
Posted by Turnip at 6:42AM
In order to write markup that conforms to the standards, it must validate, as it is a conformance constraint.

I absolutely agree with Lachlan.
Anne, why don't you commit to the application/xhtml+xml MIME type here (as you correctly advise against text/xml, but instead refer to application/xml), since XHTML requires application/xhtml+xml?
Use of a valid (XHTML) DOCTYPE and the correct MIME type definitely is a step requiring valid markup (assuming that the parser correctly checks for well-formedness etc.), though you might get away with some self-invented elements and attributes. Validation is necessary to ensure standards compliance and interoperability, so I wouldn't advise against it.
Posted by Jens Meiert at 6:13PM
I'm talking about the W3C Validator sucking. It doesn't understand some of the stuff in the specs...

I don't think you really know what a validator does. A validator checks that your markup conforms to the rules in the DTD, nothing more, nothing less.
It can't check that everything is right according to the specification. The spec could for example say that you MUST have a paragraph with the words "Hello World!" in each document, to take an extreme example, but it wouldn't be able to check that you really do. If the validator started checking things according to the spec (as it has done with this "fussy parsing" thing), it wouldn't be a validator anymore.
...therefore, if you use that valid code it won't validate.

Huh? Valid code does always validate. The spec has nothing to do with it. But please show me an example where following the spec results in an invalid document.
Posted by David Håsäther at 6:17PM
Jim, that explanation of what happens with the stylesheet (not being valid at all, perhaps only when validated against a particular non-normative CSS3 profile) is exactly my point. About validation being helpful, agreed. But I have found syntax errors or spelling mistakes easy to discover so far.
Jens, since when am I advocating XHTML? I even mentioned using HTML for the user interface, not XHTML. That implies of course, that the MIME type must be text/html and can not be application/xhtml+xml.
Posted by Anne at 8:05PM
David, I'm referring to the CSS validator. Since there is no DTD for CSS (as it isn't markup), the validator cannot do checks according to that. Certain valid (as in, conforming to the specs) CSS 2 and CSS 3 stuff brings up warnings and errors on the CSS validator.
Posted by Turnip at 9:14PM
Jens, since when am I advocating XHTML? I even mentioned using HTML for the user interface, not XHTML.

Well, you mention HTML in the third entry, but IMO it's quite misleading that you say text/xml is evil and ask people to use text/html and application/xml in the next entry (if you are referring to HTML in both cases).
Personally, I have never met anybody who delivers HTML as text/xml or even application/xml, so why do you point at it? (You're of course right, but advising against the use of application/vnd.liberty-request+xml for HTML is correct, too, and barely helpful.)
Next, why do you recommend delivering HTML as application/xml...!? I cannot find this part in any specification, and you even confirmed that you meant HTML.
(Don't get me wrong, I just want to illustrate why I assumed that you referred to XHTML in respect to MIME type considerations before.)

That implies of course, that the MIME type must be text/html and can not be application/xhtml+xml.

Of course.
Posted by Jens Meiert at 5:14PM
Jens, that point was about MIME types. It is separate from the markup languages point, obviously. The text/xml sentence is also separated (by a dot) from the next sentence, which lists the two MIME types you probably need. (XML for syndication or some XML API and HTML for the user interface.)
I think you draw the wrong conclusions from some of my points.
Posted by Anne at 5:41PM
I think you draw the wrong conclusions from some of my points.

No hard feelings.
Posted by Jens Meiert at 6:09PM
Could you please explain what you mean by:
HTML for the user interface, generic XML or XHTML for the backend

If my CMS (e.g., Wordpress or blogger.com) generates XHTML, is that the user interface or the backend? If the latter, how do I get that to the user as HTML? You are serving this page as XHTML. Isn't this page the user interface? I am quite comfortable with all of the terms in that list item except XSLT, but I don't get your point. Thanks.
Posted by Doug W at 12:17AM
Yeah, I am doing it incorrectly myself. This site will be using HTML in the future.
Posted by Anne at 1:22AM
Anne, I've since read some earlier posts on your site about HTML vs. XHTML, and I think I see where you're coming from. But I want to repeat one of my questions. If my CMS generates valid XHTML, what's the best way to go about sending HTML to the user? Should I expect Wordpress or whatever to generate the HTML for me, or should there be some intermediary program that transforms my documents, or what? Where does the back end end and the front end begin?
Posted by Doug W at 3:01AM