Anne van Kesteren

Links for XML

2 February 2005

As most people know, the only way to create a link in a normal XML document is by using XLink. Of course, when you are using a language that is defined using XML, alternatives may exist. In XHTML you can use the A element together with the HREF attribute. According to my example you cannot simply use xhtml:href in a document; the A element is really needed. That restriction will probably be removed once XHTML 2 is no longer a working draft. (Obviously, once that happens you need to use a namespace other than http://www.w3.org/1999/xhtml.)

Also, using xhtml:a does not really have a big advantage over using the two attributes you need in XLink, xlink:href and xlink:type, especially since it is more likely that a random piece of XML software will support XLink than XHTML. Another argument for XLink is that xlink:type might become optional in future specifications, with simple as the default value in such cases. Norman Walsh writes about these possible changes to XLink. (These changes were also documented by the W3C a couple of days before his write-up in an informative NOTE on Extending XLink 1.0.)
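To make the two-attribute requirement concrete, here is a minimal sketch in Python using the standard library's ElementTree. The sample document, the `simple_links` function and its `default_to_simple` flag are made up for illustration; the flag only imitates the proposed change in which a missing xlink:type is treated as simple, and is not taken from any specification text.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical sample document, not taken from the post.
SAMPLE = """\
<notes xmlns:xlink="http://www.w3.org/1999/xlink">
  <ref xlink:type="simple" xlink:href="http://example.org/a">explicit type</ref>
  <ref xlink:href="http://example.org/b">type omitted</ref>
  <ref>no link at all</ref>
</notes>
"""


def simple_links(root, default_to_simple=False):
    """Return the xlink:href of every element that counts as a simple link.

    With default_to_simple=False an element needs both xlink:type="simple"
    and xlink:href. With default_to_simple=True (an assumption modelling the
    proposed change) a missing xlink:type is read as "simple".
    """
    hrefs = []
    for element in root.iter():
        href = element.get(f"{{{XLINK_NS}}}href")
        link_type = element.get(f"{{{XLINK_NS}}}type")
        if link_type is None and default_to_simple and href is not None:
            link_type = "simple"
        if link_type == "simple" and href is not None:
            hrefs.append(href)
    return hrefs


root = ET.fromstring(SAMPLE)
strict_links = simple_links(root)
lenient_links = simple_links(root, default_to_simple=True)
print(strict_links)
print(lenient_links)
```

Under the strict reading only the first `ref` is a link; with the defaulting behaviour switched on, the second one is picked up as well.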

Besides XLink and using XHTML there are some other proposed ways to create links between XML documents. Micah Dubinko, of the XForms specification, once wrote a “specification” for xml:href and xml:src. I personally think that would be the best choice for XML linking. Such general attributes are really cool and easy to work with. The xml:id working draft is already in Last Call; xml:lang and xml:base are already widely used in XML formats. (An obvious example is Atom.) I do not see anything that could hold back the “SkunkLink” proposal. Some time ago I proposed an additional section for that specification, by the way: I thought an xml:cite would be nice to have. I hope someone can tell me why the W3C never introduced the “SkunkLink” proposal, although I believe there was some vague reason that XML documents are sometimes not used on the web and the attributes would not be useful in such cases.
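Part of what makes attributes in the xml: namespace easy to work with is that the prefix is predefined, so no xmlns declaration is needed. A rough sketch in Python follows; the sample document is invented, and xml:href here is only the proposed attribute, not something any specification defines.

```python
import xml.etree.ElementTree as ET

# The namespace the "xml" prefix is always bound to; no xmlns declaration needed.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical document: xml:lang and xml:base exist today,
# xml:href is only the proposal discussed in the post.
SAMPLE = """\
<entry xml:lang="en" xml:base="http://example.org/blog/">
  <see xml:href="2005/02/links-for-xml">Links for XML</see>
</entry>
"""

root = ET.fromstring(SAMPLE)

# Attributes in the xml: namespace are read like any other namespaced attribute.
language = root.get(f"{{{XML_NS}}}lang")
proposed_links = [
    element.get(f"{{{XML_NS}}}href")
    for element in root.iter()
    if element.get(f"{{{XML_NS}}}href") is not None
]
print(language, proposed_links)
```

A generic parser accepts the document as it stands; whether anything treats xml:href as a link is of course up to the application.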

The last method for linking in XML documents I’m going to discuss is the Opera CSS linking extensions. Although they do not work in the XHTML namespace (in practice, not in specification) they are useful for linking XML documents together. However, the problem is that they create some difficulties with the current way CSS works. (See also the CLink proposal and CSS and hyperlinks.)

Through Tim Bray I also found Linking XML documents, which covers this subject as well.

Comments

Wait, cite is already vague in HTML. So what would the purpose of xml:cite be exactly?
Posted by Sjoerd Visscher at 6:28PM
It would point to the resource the textual information in the element it is applied to comes from.
Posted by Anne at 6:38PM
...some vague reason that XML documents are sometimes not used on the web.... You really don't seem to have a clue how and where XML is being used. Anyway: the W3C is so right about not even considering xml:cite, because it really belongs in an XML hypertext markup language, such as XHTML. Don't you see that xml:cite is semantic, instead of functional? XML itself is not and must not be about semantics in any way.
Posted by Robbert Broersma at 7:09PM
I thought XML IS about semantics and nothing else. (No presentation (XSL, CSS), no functionality (probably XBL bindings, ECMAScript, or the user agent's own ways of providing functionality).) Well, man learns for his whole life...
Posted by Spike411 at 8:19PM
I was mainly talking about xml:href and xml:src actually. Which are pretty functional and not really semantic.
Posted by Anne at 8:19PM
Furthermore, I really wonder what would be wrong with xml:cite. xml:id and xml:lang add document semantics as well.
Posted by Anne at 8:29PM
xml:lang is universally applicable and application-independent, and is really of a completely different scale than any other type of meta-data. Thus it is more than reasonable to have that in the xml-prefix namespace. xml:src sounds like a complete rubbish idea to me, but let's just leave that aside. I think that the introduction of xml:href would signal that XML is meant to be a 'connected data format', and that would definitely be a bad signal, not only because it would quickly lead to unwanted complexity.
Posted by Robbert Broersma at 9:27PM
After reading over "CSS and hyperlinks", I think the fundamental problem is that we want to have canned behaviors that we can style onto elements using a stylesheet. I think XBL handles a good portion of this (although some extra DOM interfaces might help), but what do you do about user agents that don't support XBL? Is perfect separation of semantics, presentation and behavior actually possible, and if not, where do we draw the line?
Posted by Matthew Raymond at 9:55PM
Robbert, you did not respond to xml:id. Also, what about xml:base? That is also an attribute that assumes some kind of links are being used inside the document.
Matthew, I think you can require both scripting and support for markup. Presentation should always be optional.
Posted by Anne at 10:08PM
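To illustrate the xml:base point, here is a minimal Python sketch of how a relative link is resolved against the nearest xml:base values. The sample document and the `resolved_links` helper are hypothetical, and the resolution is delegated to `urllib.parse.urljoin`, so it only approximates what the XML Base specification describes.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_NS = "http://www.w3.org/XML/1998/namespace"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical sample document, not taken from the discussion.
SAMPLE = """\
<doc xmlns:xlink="http://www.w3.org/1999/xlink" xml:base="http://example.org/docs/">
  <item xlink:type="simple" xlink:href="intro.xml"/>
  <section xml:base="chapter2/">
    <item xlink:type="simple" xlink:href="part1.xml"/>
  </section>
</doc>
"""


def resolved_links(element, base=""):
    """Collect xlink:href values, resolved against the inherited xml:base."""
    # An xml:base on this element is itself resolved against the inherited base.
    base = urljoin(base, element.get(f"{{{XML_NS}}}base", ""))
    links = []
    href = element.get(f"{{{XLINK_NS}}}href")
    if href is not None:
        links.append(urljoin(base, href))
    for child in element:
        links.extend(resolved_links(child, base))
    return links


links = resolved_links(ET.fromstring(SAMPLE))
print(links)
```

The nested xml:base only has an effect because there are relative references below it to resolve, which is the sense in which the attribute presupposes links.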
Sometimes I think the W3C overthinks itself. If I remember right, very early draft(s?) of Xlink were exactly this kind of thing - xml:link (or was it actually href?) xml:src and so on. I really liked that and never understood why anything more complex was designed.
I vaguely recall an XML linking spec proposed at the W3C, that caused a commotion back in 2002 or very early '03. It had some unique concepts in it, and made a bit more sense (to me) than Xlink.
Posted by Devon at 1:04AM
Hlink! That's what I was thinking of. I can see some use for this, but it definitely needs some work still.
Posted by Devon at 1:14AM
I'm not sure whether it is a great idea to introduce application-oriented, semantically fixed elements or attributes to XML in general. I think it's a question of the point of view:
... many discussions around XML, XHTML and other standards like RSS or SVG are about the question "what is the real semantic level of a document and what is only an application/presentation level?" If you work with XML a lot and transform it with XSLT to different output formats, you tend to think that HTML / XHTML is just some kind of hypertext front end, and the structure of these documents isn't really interesting. The interesting thing is XML, and almost anything you need could be done with DTDs and XSLT. From that point of view you don't need any xml:href.
If you do a lot of XHTML / HTML stuff, you tend to think that XML is something like the next evolutionary level of the internet document architecture. And the Internet is mainly a hypertext application. From that point of view, xml:href is something very useful....
@anne: it's still a great pleasure to read your weblog, and it's getting even better now that XML topics are here...
Posted by ben at 5:52AM
What’s wrong with XLink (aside from the type attribute being required, as you say)? Why re-invent the wheel? Is it that terribly much effort to just go and declare a namespace for XLink? I can perfectly well understand a reasoning like ‘XLink already implements this, so there is no reason to do this again in XML as well’.
Also, I agree with Robbert that saying such a thing as although I believe there was some vague reason that XML documents are sometimes not used on the web is really a bit too quick about it. Judging by that line, I’m glad the W3C thinks things through better than you seem to do, no offense. There are so many occasions where XML isn’t used for document markup. Think of application settings, configuration files, etc. Defining such things in the XML spec means that parsers would have to support it in some way, too, you know... While it only makes sense for on-screen display. And judging from the size of the XLink spec, it isn’t a trivial thing, if you want to do it right.
Additionally, what if you want to specify a link as a text node inside an element instead of an attribute? Would you then have to duplicate the link in the attribute to make it actually work? In other words, xml:href would be far from a generic solution. No, I think links are much better handled by either the language itself (e.g. XHTML), or a separate dedicated method like XLink.
Okay, I know too little on the subject to know how this works with xml:lang :), but I could imagine that through the DOM you could read out an element’s language. In any case, it applies to many more cases I can think of, including configuration files etc. Language is a more fundamental part of documents than hyperlinks are, imho. There is a line to be drawn somewhere, or else XML will turn into a markup language, something it is not and must not be.
As far as you did not respond to xml:id. Also, what about xml:base is concerned: for xml:id it is obvious, the DOM specifies a getElementById function so it is a valuable addition, because without it you can only use IDs if there’s a DTD specifically telling which attributes are IDs (assuming the parser is actually processing the DTD).
As for xml:base, it is not actually part of the XML specification, but a separate (optional) module: This document describes a mechanism for providing base URI services to XLink, but as a modular specification so that other XML applications benefiting from additional control over relative URIs but not built upon XLink can also make use of it. (from the XML Base spec). However I indeed don’t really see the why in putting it in the XML specification. Anyways, thoroughly reading the XML Base spec might provide some answers :).
Anyways, concluding, XLink does this, I don’t see why such a well-thought-out wheel needs to be re-invented. Visual renderers of XML data can implement this to provide the users with the desired functionality (Mozilla already implements part of XLink).
~Grauw
Posted by Laurens Holst at 7:11AM
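The xml:id argument in the comment above can be sketched in a few lines of Python: because the attribute name is fixed, an element can be found by ID without any DTD declaring which attributes are IDs. The sample document and the `get_element_by_xml_id` helper are invented for illustration; this is a plain tree walk, not the DOM's getElementById.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical configuration-style document with no DTD at all.
SAMPLE = """\
<settings>
  <option xml:id="timeout" value="30"/>
  <option xml:id="retries" value="3"/>
</settings>
"""


def get_element_by_xml_id(root, wanted_id):
    """Return the first element whose xml:id equals wanted_id, or None."""
    for element in root.iter():
        if element.get(f"{{{XML_NS}}}id") == wanted_id:
            return element
    return None


root = ET.fromstring(SAMPLE)
timeout = get_element_by_xml_id(root, "timeout")
missing = get_element_by_xml_id(root, "nope")
print(timeout.get("value"), missing)
```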
Note by the way that even though xlink:type is required, xlink:href isn’t :)...: It is not an error for a simple-type element to have no locator (href) attribute value. If a value is not provided, the link is simply untraversable. Such a link may still be useful, for example, to associate properties with the resource by means of XLink attributes.
Also note that the xlink:type attribute isn’t required to actually be on the element... By adding the following to the document’s DTD:
```
<!ATTLIST studentlink xlink:type (simple) #FIXED "simple">
```
(or its Schema equivalent) the type attribute can (theoretically) be skipped and you can use XLink like this:
```
<studentlink xlink:href="...">Pat Jones</studentlink>
```
This is actually copied right out of an example in the XLink specification. Maybe that would take away that particular gripe against XLink?
~Grauw
Posted by Laurens Holst at 7:30AM
But no matter what people say about the usefulness of Xlink, I keep ending up asking myself — why do I need a whole 'nother namespace for something so basic and widespread, especially since it seems so logical in the xml:* namespace?
I just don't get why there's an extra level of complexity added. Anyone ever hear of KISS?
Posted by Devon at 2:01PM
Devon: but isn't putting it in another namespace than the document's (http://www.w3.org/XML/1998/namespace instead of http://www.w3.org/1999/xhtml or http://purl.org/atom/ns#, etcetera) exactly what you propose? Do y'all realise the following? XML isn't only used on the surfable web (Atom, RSS, SVG, XHTML, etc.)! It is being used to exchange invoice information, store aeroplane manuals, track packages, and so on. These document type standards already use their own elements and attributes for URI data where there was a need to, and some others use XLink.
Whatever. *alert* *alert* No offence intended: *alert* *alert*
Just think of this: in the hypothetical situation that I can prove that in 15,6% of the XML documents pricing information is stored, and in 3,43% of the documents URI data, will you want to introduce xml:currency and xml:price? If not: shut up, right now. ;-)
Posted by Robbert Broersma at 7:16PM
Devon, is it really such a big problem that you have to put one single additional xmlns on the root of your document? I don’t find that single thing, which is such a basic concept in XML, a really good argument against using XLink.
And Robbert has a good point. Having an objection against the xmlns, you might as well object against the xlink: (or xml:) prefix as well. But of course, namespaces are the very foundation of why XML is extensible. If you really want short links, just create a document format which implements them (like XHTML). Making them function as actual links in your visual display agent can be done with languages like XBL, if I’m not mistaken.
~Grauw
Posted by Laurens Holst at 8:47PM
Laurens, I know how you can avoid declaring the attribute. I do not think it is that useful; simple should have been implied from the beginning. (Or FIXED, in DTD terms.)
Also, if xml:base can get a separate specification as a so-called module, what would be wrong with a module for xml:href and xml:src?
Posted by Anne at 12:11AM
I agree that it would have been nice if when skipping a xlink:type attribute it would have defaulted to simple (though a situation when both the xlink:href and the xlink:type are optional might be a tiny bit weird). Maybe in a future edition of the spec. As for xml:base, I don’t really see the great use in it, and I don’t think it should have been included in the XML namespace at all.
Anyways, my opinion with regard to xml:href and xml:src: this functionality already exists, and whether xml:base exists in the XML namespace or not doesn’t really change that :).
The need for having a general hyperlink mechanism has been recognised, a specification has been developed, and that specification chose to use its own namespace, and to look at the concept of linking from a broader perspective. Creating a secondary specification with the same functionality will only create confusion, and less portability between XML display clients where one implements XLink and the other xml:href and xml:src.
~Grauw
Posted by Laurens Holst at 6:35AM
I think CSS linking is absolutely brilliant, and we would do ourselves a major disservice not to have that capability. As I see it the role of CSS linking has been neatly described by the XLink definition of hyperlink. The domain of CSS would be hyperlinks by this definition, the domain of markup languages would be links.
It is true that we at Opera have let our linking extensions slide. We would much prefer to have something standardised than proprietary extensions. The need for -o-replace has also been alleviated by our support of content: url() for elements, extending that to accept an attribute might do the rest.
Even before that linking extensions are valuable as proofs of concepts, and I agree: The greatest shortcoming with them right now is that we have excluded HTML attributes like cite, longdesc, src, and href.
Posted by Jonny Axelsson at 7:10AM
Laurens, so you think XHTML 2.0 should adopt XLink and drop several linking features it has now?
Jonny, I really like the Opera CSS linking extensions as well. Did you actually exclude those attributes or does Opera just not support them in the XHTML namespace? The former sounds a bit weird.
Posted by Anne at 5:41PM
Anne, no, not at all! :). As I said um, earlier (I think, lemme look it up)... Ah:
No, I think links are much better handled by either the language itself (e.g. XHTML), or a separate dedicated method like XLink.

If you really want short links, just create a document format which implements them (like XHTML).

I guess I didn’t convey my point here strongly enough :). Basically I think hyperlinks are the domain of a markup language, just like em and table and unordered lists. XLink is a nice generalised extension, in case you can’t control the UA enough to convert certain parts of your document into hyperlinks (XBL comes to mind), but on the other hand it poses a restriction on how links are placed in your document.
I think in e.g. an XML storage of browser bookmarks, it would be preferred to have it in an element <url> just like there would be a <title>, instead of having to place the hyperlink as an xlink:href somewhere (kinda inconsistent). Same can be said of one of the many RSS formats. Links are basically stored in many ways, and also presented in several ways, and it’s usually up to an XML markup format itself to determine that. For the exception cases, there is XLink.
By the way, what would you want to use XLink for? Surely you’re not thinking of publishing marked up & styled XML documents? Might as well use spans and divs for everything then, because to an outsider every custom XML format is pure riddles. That’s why standards like XHTML exist.
Regarding XHTML2: I really think it’s all good :). I was marking up a document for print with CSS recently (yeah, Prince, blabla, they rule ;p), and I mean a real document as in a technical documentation document. And I dearly missed tags like blockcode, section, l and the ability to place links directly on lis. I actually used divs for the sections (more limited than I would have used section though), and created a tag in my own namespace for l, discovering that even Mozilla apparently still implements the ‘old’ \: namespacing method in CSS (instead of a |), which never made it out of working draft.
~Grauw
p.s. your comment script eats my backslash in the \: above, I have to duplicate it for it to stay...
Posted by Laurens Holst at 7:51PM
Laurens, yeah, WordPress sucks with that. However, Mozilla has support for CSS 3 namespaces which are part of the Selectors specification which is a bit further than just working draft. You do need to declare @namespace and such of course. (See also CSS namespaces for more information.)
Posted by Anne at 7:57PM
Anne, hmm... Last time I tried that, it didn’t work, I had to use the old syntax!
Maybe I didn’t do it correctly (although Prince matched it)...
~Grauw
Posted by Laurens Holst at 6:36PM
Ah, Anne, I think the reason as to why it didn’t work was a combination of me parsing the document where I used this in HTML mode (darn .html extension), and not having the @namespace rule at the beginning of the stylesheet. I should read the specs better :).
Anyways, I just used the namespace selectors again... This time I added a g:sort attribute to a table head, in my own namespace, which a script hooks into to make the table sortable. Check it out, it’s at this page. I’m using the namespaced CSS to style the visual feedback. Of course, it doesn’t work in IE, but neither does the sorting, heh :). Opera also doesn’t get it, so for that browser you’re without visual feedback (well, actually I do a style.cursor="pointer" in the JavaScript :)).
Although now the page doesn’t validate against the DTD anymore, of course. Guess I should fix that. Wasn’t there some article about that on ALA? Ah, there is... Hmpf, I hate DTDs, they’re so clueless about namespaces and remove all flexibility...
Posted by Laurens Holst at 5:14AM
(Regarding Opera, its XML and namespaces support has only matured sufficiently in its most recent 8.0 beta for the script to work at all, before that it was utter dysfunctional crap. It would be nice if they implemented CSS3 namespace selectors as well though.)
Posted by Laurens Holst at 5:53AM