Sometimes HTML is not enough and you need a mechanism to include some custom data in the document. In 2005, "Validating a Custom DTD" was published on A List Apart and illustrated how you could add custom attributes to HTML and have them validate. The problem with the approach outlined in that article is that DTDs are a thing from the past and that only the W3C validator cares about them. (Newer validators, such as Validator.nu, don’t have this issue.) Browsers ignore DTDs, and the only reason they still look at the doctype of a page is to determine the rendering mode. So if you add a custom DTD and add a required attribute, to the browser that will look as if you added a required attribute to HTML. Now if a future version of HTML introduces an attribute with the same name but different semantics, your page might behave unexpectedly in future browsers.
There is a custom data proposal for HTML5 that allows authors to add custom data to their pages without interfering with future extensions to HTML. The idea is that all attributes starting with data- are reserved for Web authors, who can do whatever they like with them. (They are not intended for browser extensions, et cetera.) In addition, there will be a DOM attribute, dataset, that allows easier access to these attributes. For an attribute data-opacity, you can access it using dataset.opacity instead of having to use
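As a rough illustration of the mapping described above, here is a minimal sketch in plain JavaScript that needs no browser. It assumes the name conversion used by the current HTML Standard (strip the data- prefix, then camelCase any hyphenated parts), which may differ in detail from the proposal as discussed in this post. The helpers `toDatasetKey` and `datasetFrom` are hypothetical names made up for this example, not part of any DOM API.

```javascript
// Sketch: how data-* attribute names map to dataset keys.
// Assumption: follows the current HTML Standard's name conversion.
function toDatasetKey(attrName) {
  if (!attrName.startsWith("data-")) {
    return null; // not a custom data attribute
  }
  // Drop the "data-" prefix, then turn "-x" into "X" (camelCase).
  return attrName
    .slice("data-".length)
    .replace(/-([a-z])/g, (_, letter) => letter.toUpperCase());
}

// Hypothetical helper: build a dataset-like object from a plain
// { attributeName: value } map, standing in for a real element.
function datasetFrom(attributes) {
  const result = {};
  for (const [name, value] of Object.entries(attributes)) {
    const key = toDatasetKey(name);
    if (key !== null) {
      result[key] = value; // only data-* attributes are exposed
    }
  }
  return result;
}

const dataset = datasetFrom({ "class": "box", "data-opacity": "0.5" });
console.log(dataset.opacity); // prints 0.5
```

In a browser that implements the feature, the equivalent read on a real element would be `element.dataset.opacity`, with `element.getAttribute("data-opacity")` as the generic DOM alternative.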
Namespaces were considered, but integrating them in the existing HTML environment is harder and they would also make it harder to author.
DTDs are a thing from the past and that only the W3C validator cares about them
I'm not sure I fully agree with this statement. I still see DTDs frequently used in XML documents; their primary use lies in the ability to define named character entities. The original HTML4 specification defined its named character entities using DTD catalogs. If what you say is true for HTML5, this means that named character entities have become a "part" of the language itself, and they aren't coming from DTDs anymore. Is this the case?
Yes. (It has been true for HTML as a language practiced on the Web for quite a while now so making that more formal makes sense.)
Wouldn’t it be better to just ask the user to prefix their custom attributes, like CSS does?
I did that recently in a control I created: I needed to store a databound ID on various parts of the HTML inside, which I did using a btl_data-id attribute (btl being the prefix here). That seems better than only allowing the data- prefix, which seems unnecessarily restrictive and has a greater potential for conflicts when running several controls from different frameworks/authors alongside each other.
Or, maybe adding accessor methods for RDFa to the HTML DOM would provide the desired functionality (just a wild thought that I didn’t think very carefully about!).
Laurens, I think you'd write that as data-btl-id if you wanted to follow the
Laurens, that would also force usage of the underscore. On top of that, it becomes more difficult for authors to figure out what they have to do. As for RDF, that would be an order of magnitude more complicated and go far beyond what is actually needed as a solution here.
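To make the data-btl-id suggestion above concrete, here is a small sketch of how an author prefix could live inside the data- space rather than in front of it. The helper names `authorDataAttribute` and `authorDatasetKey` are hypothetical, and the dataset key again assumes the current HTML Standard's camelCase conversion rather than anything stated in this thread.

```javascript
// Hypothetical helper: namespace a custom attribute under data-
// with an author prefix, e.g. ("btl", "id") -> "data-btl-id".
function authorDataAttribute(prefix, name) {
  return `data-${prefix}-${name}`.toLowerCase();
}

// Dataset key for that attribute, assuming the current HTML Standard's
// name conversion: "data-btl-id" -> "btlId".
function authorDatasetKey(prefix, name) {
  return authorDataAttribute(prefix, name)
    .slice("data-".length)
    .replace(/-([a-z])/g, (_, letter) => letter.toUpperCase());
}

console.log(authorDataAttribute("btl", "id")); // prints data-btl-id
console.log(authorDatasetKey("btl", "id"));    // prints btlId
```

Two controls from different authors would then use different prefixes after data-, so their attribute names stay distinct without needing an underscore.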
That sounds like an excellent idea and a solid implementation!
x- instead of the verbose (and not always appropriately named) data-? The corresponding DOM object could be called extattr instead of dataset, which too is less inaccurate in various cases.
I'm liking X/HTML 5 more and more all the time. I didn't like it in the beginning.
François, it is mostly for ease of authoring and encouraging authors to use these attributes instead of attributes that might clash with future versions of HTML.
Aristotle, the problem with that is that the attributes are not experimental, which is what x- is often used for. They are simply representing custom data.