Anne van Kesteren

Microsoft on standards

11 November 2004

For a moment there you thought this would be the start of another rant. Fortunately, Microsoft isn't always stupid.

Arthur L., the guy who didn't show up but had a good reason, gave me a link today to the MSN Search (Beta) site. It's funny that the site uses XHTML, something that Internet Explorer doesn't get, so I assume the site is made by some external company or some new guy who doesn't quite get it. However, when you validate the thing you will notice there is only a single mistake and you will also notice that the single mistake won't do any harm, since it doesn't make the document ill-formed. (Of course, that's just theory: the site is actually invalid HTML.)

Other good news is that Microsoft uses UTF-8, a character encoding that is so good it should be the only one available. (That would also save the world from a lot of trouble.) When you hit "search" and validate again, the validator returns four errors. The two that are really bad are about duplicated ID attribute values. Fortunately for Microsoft they are not declared on the same element, but in my humble opinion such errors should make a document ill-formed too. Maybe that's something for XML 2.0?

Comments

Requiring a processor to detect duplicate IDs makes the memory requirements of the processor dependent on the size of the input document. Even streaming processors have to remember every ID that they see. That's probably inappropriate.
That's why the xml:id specification says the processors SHOULD detect this error, not MUST.
Posted by Norman Walsh at 8:59AM
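The memory argument in the comment above can be illustrated with a small sketch. It assumes a streaming (SAX) parser from the Python standard library and treats any attribute literally named "id" as an ID, which is a simplification: without a DTD or xml:id, a non-validating parser does not know which attributes have type ID. The names DuplicateIdHandler, find_duplicate_ids and SAMPLE are made up for this example.

```python
import xml.sax


class DuplicateIdHandler(xml.sax.ContentHandler):
    """Records every "id" value seen while the document streams past."""

    def __init__(self):
        super().__init__()
        self.seen = set()      # grows with the number of distinct IDs
        self.duplicates = []   # values that showed up more than once

    def startElement(self, name, attrs):
        # Assumption: an attribute named "id" is an ID.
        value = attrs.get("id")
        if value is None:
            return
        if value in self.seen:
            self.duplicates.append(value)
        else:
            self.seen.add(value)


def find_duplicate_ids(xml_text):
    """Return (duplicate values, number of IDs held in memory)."""
    handler = DuplicateIdHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.duplicates, len(handler.seen)


# Hypothetical document: well-formed, yet "results" is used twice.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div id="results">first</div>
    <p id="results">second</p>
    <p id="footer">third</p>
  </body>
</html>"""

duplicates, remembered = find_duplicate_ids(SAMPLE)
print(duplicates, remembered)
```

Two things are visible here. The parser accepts the document without complaint, since duplicate IDs are a validity problem and not a well-formedness one. And the set of remembered values has to be kept until the end of the document, so its size depends on the input even though the parser itself streams.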
It's really good to see that they're very close to validation. Hopefully by the final release, they will have fixed those few errors, but it's already one thing they beat Google on.
The HTTP headers still need fixing. They're setting the charset via the meta element, but at least they've got a Content-Type: text/html in there (even though, as you know, it should be application/xhtml+xml).
However, the headers for the CSS are missing the Content-Type, so it wouldn't validate. When I validated via the text area, it only gave 2 errors about the use of cursor: hand;, obviously to handle some old IE implementation, but they've got cursor: pointer; there for standards-compliant implementations also, so it's not too bad.
Posted by Lachlan Hunt at 10:17AM
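The header check described in the comment above could be sketched roughly like this. The sketch only inspects a Content-Type value that has already been obtained; it does not fetch anything. The function names describe_content_type and served_as_xhtml are invented for the example, and the header values are illustrative, not copied from the MSN Search responses.

```python
from email.message import Message


def describe_content_type(header_value):
    """Split a Content-Type header value into (media type, charset or None)."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type(), msg.get_content_charset()


def served_as_xhtml(header_value):
    """True only when the media type is application/xhtml+xml."""
    media_type, _charset = describe_content_type(header_value)
    return media_type == "application/xhtml+xml"


# Illustrative values: a bare text/html header with no charset parameter,
# and the header an XHTML document would ideally be sent with.
for value in ("text/html", "application/xhtml+xml; charset=UTF-8"):
    print(value, describe_content_type(value), served_as_xhtml(value))
```

A missing charset in the header (reported as None here) is the case where the charset declared in the meta element ends up doing the work.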
I had to rub my eyes and look again the first time I viewed source. What's that? A DOCTYPE! Regardless of the XHTML vs HTML issues, it's still a Strict DOCTYPE. And there is only a single validation error. Amazing.
They need to have a look at it in Safari though - their futile attempts at styling the buttons make them look like crap.
Posted by Roger Johansson at 3:48PM
WOW! This is the greatest step in the right direction I've ever seen Microsoft take. Perhaps any big company, in any area. Too bad that they are fsxking XHTML 1.1 up with Visual Studio 2005, though, by having «XHTML 1.1 Strict» as a validation option, which gives an XHTML 1.1 DOCTYPE, but sends out all documents as text/html.
Well, well. They have to start somewhere. And this is a very good start indeed. A big congrats is in place, I guess. Wonder when Google will follow up and finally start to write valid code.
Posted by Asbjørn Ulsberg at 5:49AM
I noticed that the help texts are still the same as on the old search engine, only a little extended at some places. The following text remained the same:
Use only well-formed HTML code in your pages. Ensure that all tags are closed, and that all links function properly. If your site contains broken links, MSNBot may not be able to index your site effectively, and people may not be able to reach all of your pages.
Now I don't have to burst out in loud laughter reading that text. :)
Posted by Frenzie at 5:45PM
…some new guy who doesn't quite get it…

Maybe they do get it and just have a different opinion on the matter.
Posted by Kris at 4:49PM
I assume the site is made by some external company or some new guy who doesn't quite get it.

Or maybe, there's someone in Microsoft who believes that bugs in a single package needn't mean ignorance of major technology.
Posted by David House at 3:23AM