I want to do something for all those people who want to do markup and style the correct way. That's why I'm starting a new chapter on my weblog: Learning. Learning will cover all the basics of building a better, backward- and forward-compatible site. Everything explained is based on XHTML 1.0, which differs only slightly from HTML 4.01. The lessons are for everyone who wants better search results and a more maintainable site. Your site's look will remain the same. The only thing we do is add some extra attributes and elements and replace some others with CSS, while everything keeps working. You can always improve your markup, and you don't necessarily have to start with a tableless site (actually, starting with a table-based design will give you more insight into how powerful good markup and style are).
(You don't know anything about (X)HTML? Start here: XHTML tutorial from the scratch or here: Getting started with HTML.)
Let's start with the first tag in your document, the <html> tag. It probably looks just like this, with no additional attributes. What do you think if we change it to this:
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en" id="website-extension">
Not everything is being made smaller, you know ;). The first attribute is probably the hardest one of them all, but it is essential for letting a browsing device know that you use XHTML. If you just use HTML you can leave it out. Then we have the lang and xml:lang attributes. The first is for backward compatibility with older browsers (and the only one available for HTML); the second, xml:lang, is there for forward compatibility. The last attribute is the id attribute, which you probably already know. We have it there so that a user can write a style sheet specifically for your site; we call that accessibility ;). The language attributes are also very important for accessibility and for Google. This way a screen reader will know what the primary language of your web document is, and Google can index your site under a specific language, which will improve the search results.
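For comparison, here is a minimal sketch of the plain HTML 4.01 counterpart, where the namespace and xml:lang do not apply. The id is left out on the assumption that the page has to validate as HTML 4.01, whose html element does not accept an id attribute; the title and heading text are placeholders.

```html
<html lang="en">
  <head>
    <!-- Plain HTML 4.01: no xmlns and no xml:lang, lang alone names the primary language. -->
    <title>Example page in HTML 4.01</title>
  </head>
  <body>
    <h1>Example page</h1>
  </body>
</html>
```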
The <html> tag is always followed by the <head> tag. Within the head element you can specify a lot of other elements. For now, the most important is the title element. Every document should have exactly one. This element is also important to Google and should contain the website's or company's name and a short description of the content of that page. It is recommended that you don't specify the following element and content within the head element: <meta http-equiv="content-type" content="text/html;charset=iso-8859-1" />. This should be done through a server-side language like PHP, or be set in the web server. In PHP you could put this at the top of your documents:
    <?php
    // Send the media type and encoding as an HTTP header, before any output.
    $charset = "iso-8859-15";
    $mime_type = "text/html";
    header("Content-Type: $mime_type; charset=$charset");
    ?>
You should change the value of the variable $charset to the encoding you are using. If you want to send XHTML as real XML, you can read "send application/xhtml+xml" on how to do that in PHP. You should know that sending XHTML 1.0 as text/html is perfectly valid, but it is recommended that you send it as application/xhtml+xml to browsers that support it.
If you can't use any server-side language and you don't have access to your web server, you need to use the meta element or a PI (strictly speaking, the XML declaration) to specify your character encoding. The meta element that contains the character encoding should be the first element within the head element. A PI should appear at the very top of the document (before the html element).
Example with PHP:
    <?php
    $charset = "iso-8859-15";
    $mime_type = "text/html";
    header("Content-Type: $mime_type; charset=$charset");
    ?>
    <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en" id="website-extension">
      <head>
        <title>Example page with PHP</title>
      </head>
      <body>
        <h1>Example page with <abbr>PHP</abbr></h1>
      </body>
    </html>
Example with the meta element:
    <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en" id="website-extension">
      <head>
        <meta http-equiv="content-type" content="text/html;charset=iso-8859-1" />
        <title>Example page with PHP</title>
      </head>
      <body>
        <h1>Example page with <abbr>PHP</abbr></h1>
      </body>
    </html>
Example page with the Processing instruction:
    <?xml version="1.0" encoding="iso-8859-1"?>
    <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en" id="website-extension">
      <head>
        <title>Example page with PHP</title>
      </head>
      <body>
        <h1>Example page with <abbr>PHP</abbr></h1>
      </body>
    </html>
Additional resources:
Sorry, you can't have an id attribute in the html element. It sounds crazy, but that's the way it is. You'll just have to put it in the body tag instead (which is not entirely satisfactory, since we might need to style the html element as well in XHTML).
If you don't believe me, try it with a validator.
There are ways to make it possible: ID on root element.
BTW, it is allowed in XHTML 1.0 (Second Edition), and that is the main specification these tutorials are about.
which is not entirely satisfactory, since we might need to style the html element as well in XHTML
We have to style the html element in XHTML, but that topic was a long time ago.
For the sake of forward compatibility towards XHTML 1.1 it may be better to avoid the complex issue of id on the root element, as this is a tutorial. Styling the head element is advanced stuff anyway, for which I personally haven't found any use yet (although I can imagine it being used for debugging purposes). So id on the body element will do.
A much more serious problem is the omission of the DOCTYPE declaration here. You will not have a valid XHTML document without one! It is not the namespace attribute which informs the browser that we are using XHTML, but the DOCTYPE declaration. I suppose the next tutorial will be about this issue…
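For reference, a minimal sketch of a document that carries such a declaration. The choice of the XHTML 1.0 Strict DTD is an assumption, since the thread does not name one, and the title and heading text are placeholders.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <!-- Assumption: Strict DTD. The Transitional and Frameset DTDs use different identifiers. -->
    <title>Example page with a DOCTYPE</title>
  </head>
  <body>
    <h1>Example page with a DOCTYPE</h1>
  </body>
</html>
```

The declaration goes before the html element; the namespace attribute stays where it was.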
I would also drop the lang attribute. I really think xml:lang will do. If you are concerned about backwards compatibility I would prefer using a meta tag for this purpose.
Ben,
XHTML 1.1 is based on XHTML Modularization, which IMHO should be updated to include the id attribute. The point is also not to style the head element, but the html element. Otherwise you can't specify the background-color properly. XHTML == XML, remember?
You are 'also' wrong about the DOCTYPE. Including such a thing can make a big difference to the current design. This tutorial is only about proper markup, not about making a completely valid site. That will come later, but having to learn how to handle DOCTYPEs can be a big disadvantage for newcomers.
Since xml:lang isn't even supported in Mozilla, I include the lang attribute for compatibility. That's also a reason why I don't talk anybody into application/xhtml+xml.
Proper markup is most important from my point of view. The next part will handle the basics of style sheets, so I can eliminate some rubbish attributes (maybe even elements) in the third.
About the namespace: that is exactly the attribute which tells the browser what kind of markup language we use. Try this: insert the xmlns attribute with the appropriate value into one of the documents.
I tried what you said in #6. If I save a test document as .xhtml, my server will send it as application/xhtml+xml, causing it to be processed as XML by Mozilla Firebird. Without a DOCTYPE declaration it chokes on a character entity, with and without the xmlns attribute. This tells me the namespace is not enough to tell the browser we are using XHTML. We really do need the DOCTYPE declaration. It is important to tell this to beginners as well!
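A sketch of a test document along these lines, assuming the entity in question is a named one such as &nbsp; (the comment does not say which). Without a DTD, an XML parser only knows the five predefined entities (&amp;, &lt;, &gt;, &quot;, &apos;), so a named HTML entity is a well-formedness error, while a numeric character reference is always allowed.

```html
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <title>Entity test without a DOCTYPE</title>
  </head>
  <body>
    <!-- Well-formed as it stands: a numeric reference needs no DTD. -->
    <p>10&#160;km</p>
    <!-- Assumption: writing &nbsp; in the paragraph above instead would make
         an XML parser stop with an "undefined entity" error. -->
  </body>
</html>
```

Served as application/xhtml+xml, this version should parse; the failure only shows up once the named entity is used in content.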
But you are right about styling html. We do need this, so we should be able to use id on the html element.
The meta element which contains the character encoding should be the first element within the head element.
Any specific reason why it should be the first element?
The META declaration must only be used when the character encoding is organized such that ASCII-valued bytes stand for ASCII characters (at least until the META element is parsed). META declarations should appear as early as possible in the HEAD element.
This can be found within the first resource I specified.
Ben,
Now try it with a DOCTYPE, without the xmlns attribute. It is not about entities, although they are a part of the specification. It is about how the h1 element is treated, for example, and how the document is handled if there is no style applied to it.
Thanks, didn't know that.