Anne van Kesteren

Basic principles of RDF

19 May 2006

At XTech I learned some basic principles of RDF after talking about it briefly with Steven Pemberton. Basically you have some vocabularies defined with RDF in mind or they are just called RDF vocabularies (I believe the latter). These vocabularies consist of a namespace and one or more (or perhaps zero or more) properties. XHTML would not be such a vocabulary and FOAF is an example of one.

The other thing about RDF is triples. “Mark is an XForms fanboy” and “Mark is speaking at XTech” are both examples of triples. The subject node in the first example is “Mark”, the predicate is “is” I guess, and the object node is “XForms fanboy” (or the other way around, but it doesn’t really matter anyway). The thing is that you can derive other triples from those triples, like “XForms fanboys were speaking at XTech.” This is all pretty cool and you can actually base some applications on it for single files and text/html-namespace-hackery (crazy, crazy, crazy).
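To make the triple idea a bit more concrete, here is a minimal sketch in Python. It assumes plain strings for nodes and made-up predicate names ("is a", "is speaking at", "had a speaker at"); real RDF would use URIs, and the derivation rule is just one hypothetical rule, not something RDF defines.

```python
# Each statement is a (subject, predicate, object) tuple.
# The predicate names are illustrative assumptions, not RDF vocabulary.
triples = {
    ("Mark", "is a", "XForms fanboy"),
    ("Mark", "is speaking at", "XTech"),
}


def derive_group_statements(triples):
    """If X 'is a' G and X 'is speaking at' E, derive (G, 'had a speaker at', E)."""
    derived = set()
    for subject, predicate, group in triples:
        if predicate != "is a":
            continue
        for other_subject, other_predicate, event in triples:
            if other_subject == subject and other_predicate == "is speaking at":
                derived.add((group, "had a speaker at", event))
    return derived


derived = derive_group_statements(triples)
print(derived)
```

The derived statement is itself just another triple, so it could be added back to the set and used by further rules.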

That last bit gets me to the next point. It won’t work. Unless there is some way to integrate it with HTML the idea won’t fly. Besides that, it seems like RDF is fairly complex for people who have trouble understanding HTML. I guess microformats fit into this picture in some way, but they are different from the whole RDF thing.

There is another common theme about RDF by the way: the XML serialization sucks. The above was mostly based on a proposal called RDFa that was presented during XTech. The talk didn’t address the concept of “Compact URIs” though…

Comments

RDFa conveniently integrates into XHTML2. Compact URIs are just namespace prefixes for attribute values.
As for it not flying... well, I guess we'll see. It's technically sound, but has more than its fair share of detractors. HTML's problems and misunderstandings have largely been a result of its perception as a layout language; I don't see analogous problems for RDF. It's really no harder to understand than the relational data model (and is a lot simpler in many ways).
Posted by Brendan Taylor at 6:59AM
In your first example, I would have said that the predicate was “is an x fanboy”, which lets you use the same predicate to express the relationship between someone else and UML or Star Trek, for example.
From personal experience, RDF is much more understandable if you start from the graph or Dave Beckett’s Turtle than the standard XML serialisation. If fewer people came across RDF via its XML serialisation it might be more popular.
Posted by Carey Evans at 11:50AM
Also, take a look at this presentation and search for "RDF". By the way, I did a thesis about the Semantic Web, you can download it over here.
Posted by Chris Eidhof at 6:43PM
RDFa is microformats done right.
Posted by Sjoerd Visscher at 2:14AM
Microformats are an easy and clean way to add meta-data, while RDFa has a bigger impact on the initial XHTML file. With an architect's eye I like the idea of RDFa, with everything fitting nicely into a namespace. But as a web designer I think it's nice that you can add some pieces of meta-data without having to worry about things like namespaces.
The future is bright, and microformats are a nice way to get one step closer to that future. RDF will probably be that future one day, but not just yet...
Posted by me at 4:47PM
People tend to think ‘RDF is difficult’ (I used to), but it really isn’t. RDF/XML is indeed a big culprit here. In any case, RDF/A will make RDF annotations accessible to everyone, in exactly the same way as Microformats (except that it’s using different attributes). So statements like:
Besides that, it seems like RDF is fairly complex for people who have trouble understanding HTML. I guess microformats fit into this picture in some way, but they are different from the whole RDF thing.

and:
The future is bright, and microformats are a nice way to get one step closer to that future. RDF will probably be that future one day, but not just yet...

are a bit off; Microformats are very, very similar to RDF: they make semantic statements about something, and RDF does the same thing. One of the advantages of RDF however is that it’s generic and does not depend on some general ‘agreement’ on an ontology (vocabulary), although in practice that will often happen (such as with FOAF). Another often-mentioned advantage is that it can be ‘reasoned’ about, but for the average user that’s a fairly fuzzy and uninteresting topic.
The basis of RDF is indeed making simple statements, which consist of three parts: Subject, Predicate, Object, e.g. ‘Klaas, buys, a Volvo’. Each of these is identified uniquely with a URL, for example http://example.org/foaf.rdf#Klaas, http://example.net/actions-ontology#buys, http://example.com/carbrands#Volvo.
Or, expressed in a friendlier way with semi-QNames (CURIEs) in RDF/A: myfoaf:Klaas, ao:buys, cars:Volvo (with the appropriate namespace bindings done at the top).
So Klaas is uniquely identified here with http://example.org/foaf.rdf#Klaas, and as that URI references a FOAF profile you can also find all kinds of other information there, or use an RDF search engine to see where else this specific Klaas (as opposed to all the other Klaas’s out there) is mentioned.
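As a rough illustration of how the short form maps back to the full URIs, here is a small Python sketch. The prefix table is an assumption read off the example URIs above, and expand_curie is a hypothetical helper, not part of any RDF/A tool.

```python
# Prefix bindings assumed from the example URIs in this comment.
PREFIXES = {
    "myfoaf": "http://example.org/foaf.rdf#",
    "ao": "http://example.net/actions-ontology#",
    "cars": "http://example.com/carbrands#",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Turn 'prefix:reference' into a full URI using the prefix bindings."""
    prefix, separator, reference = curie.partition(":")
    if not separator or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + reference


# The statement 'Klaas, buys, a Volvo' as three full URIs.
statement = tuple(expand_curie(c) for c in ("myfoaf:Klaas", "ao:buys", "cars:Volvo"))
print(statement)
```

The expansion is plain string concatenation, which is why the binding has to end in the "#" (or "/") that separates the namespace from the local name.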
I think that pretty much summarises the important parts of RDF. There’s more stuff like you can also make statements about statements, couple things together in various ways, and consider things on higher levels such as that Klaas is a person, a person is also a human (?), and any human is a mammal, etcetera, but that’s not really important for most people.
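The ‘higher levels’ part can be sketched in a few lines of Python too. This is only a toy: the predicate names "type" and "subClassOf" are loosely modelled on RDF Schema, the class names are taken from the example above, and all_types is a hypothetical helper.

```python
triples = {
    ("Klaas", "type", "Person"),
    ("Person", "subClassOf", "Human"),
    ("Human", "subClassOf", "Mammal"),
}


def all_types(subject, triples):
    """Return every class the subject belongs to, following subClassOf links."""
    found = set()
    todo = [o for s, p, o in triples if s == subject and p == "type"]
    while todo:
        cls = todo.pop()
        if cls in found:
            continue  # already handled; also guards against cycles
        found.add(cls)
        todo.extend(o for s, p, o in triples if s == cls and p == "subClassOf")
    return found


klaas_types = all_types("Klaas", triples)
print(sorted(klaas_types))
```

Nobody ever stated that Klaas is a mammal; it follows from the three statements that were made.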
So please don’t regard RDF as something difficult, it’s not so complex if you’re reusing existing ontologies, and RDF/A only makes it easier.
Something really nice is RDF databases; in relational databases, there are always architectural decisions about what data you put in which table, e.g. when you have an employees table, do you put the money an employee earns in the same table, or do you only want personal info in that table and the earnings somewhere else? With RDF, you just make the statement ‘employee 1234 earns €1500’, and you are done with it. You can query anything you want (if the data is there), and make one-to-many or many-to-many relationships without problems.
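A very rough sketch of that idea in Python, assuming an in-memory set of triples instead of a real RDF database. The identifiers ("employee:1234", "worksOn", "project:A" and so on) and the add/match helpers are made up for illustration.

```python
store = set()


def add(subject, predicate, obj):
    """Record one statement; there is no table layout to decide on."""
    store.add((subject, predicate, obj))


def match(subject=None, predicate=None, obj=None):
    """Return all stored triples matching the pattern; None is a wildcard."""
    return {
        t for t in store
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    }


add("employee:1234", "earns", 1500)
# Many-to-many: employees can work on several projects and vice versa.
add("employee:1234", "worksOn", "project:A")
add("employee:1234", "worksOn", "project:B")
add("employee:5678", "worksOn", "project:A")

print(match(subject="employee:1234", predicate="earns"))
print(match(predicate="worksOn", obj="project:A"))
```

A real store would add indexing and a query language such as SPARQL, but the data model stays this flat.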
With regard to:
Basically you have some vocabularies defined with RDF in mind or they are just called RDF vocabularies (I believe the latter).

They’re called ontologies. Utrecht University has a course about RDF and such, I believe it’s called Advanced Databases. I can recommend it.
Unless there is some way to integrate it with HTML the idea won’t fly.

I’d say for now, this is sufficient. How to extract RDF/A from HTML (even though not specified) seems pretty clear, and it can be addressed in a future specification (if not from the W3C, maybe from the Microformats people?). Rome wasn’t built in one day :).
Posted by Laurens Holst at 8:59PM
Ah, the course was Advanced Database Systems. It’s only given once every two years though, so be sure not to miss it next year :).
~Grauw
Posted by Laurens Holst at 9:09PM
How could RDF/A be something of the present? The spec itself talks about XHTML 2, while:
- XHTML2 is not even a recommendation yet
- No browser supports it
Posted by me at 10:15PM
A profile of XHTML 1.1 is in the making, and according to Mark Birbeck I could use it today (so I suppose I will :)).
Browser support is not necessary, after all, how many browsers support microformats (and is that stopping you from using them)?
I think it would be good if the whole microformats movement would move to RDF/A syntax as soon as possible, to get this through into common use as soon as possible.
~Grauw
Posted by Laurens Holst at 11:46PM
You don't need anything to support microformats, the class and rel attributes are well-supported. And what are microformats more than a few class attributes and sometimes a rel attribute?
Posted by me at 1:51PM
me, but RDF/A is ‘more semantic’ and certainly more flexible and general-purpose than microformats are.
Why stick to what we have now if there’s something better out there?
By the way, RDF/A also re-uses the existing attributes (e.g. the rel attribute) as much as possible :). But contrary to Microformats it doesn’t confuse ‘class’ and ‘title’ for something else, so hence the new ‘property’ and ‘content’ attributes.
~Grauw
Posted by Laurens Holst at 3:20PM
Well, I guess there are several reasons:
- RDF is hard;
- RDF doesn’t work with HTML;
- Microformats do.
Introducing new attributes and such doesn’t make things easier and introducing namespaces doesn’t make things easier at all for your average copy & paste author.
Posted by Anne at 3:46PM
That's indeed about it Anne.
I have to deliver solutions to customers now, which means a combination of HTML pages and XHTML 1.0 pages, and RDF/A does not fit into that picture right now... but rest assured I'll be using RDF as soon as possible! As a matter of fact, I'm already using RDF in combination with XSLT to generate a sitemap, but I have never implemented it in a real-life situation.
Posted by me at 5:18PM
Anne,
RDF is hard;

Well, I don’t agree with that; if Microformats are easy to understand then so is RDF. There are an equal number of attributes involved, with pretty much the same values. ‘title’ becomes ‘content’, ‘class’ becomes ‘property’, ‘profile’ becomes ‘xmlns’ and that’s about it. All the average user needs is documentation which tells you where to put which attributes and which values.
RDF doesn’t work with HTML;

Nonsense, there is no technical reason why it wouldn’t. If you need to be explained explicitly how this would have to be done, yes it would be great if some group specified explicit rules for this (WHATWG?), but it seems pretty straightforward anyway. Anyways, once RDF/A is used frequently enough in HTML context this will be standardised soon enough.
and introducing namespaces doesn’t make things easier at all for your average copy & paste author.

I don’t see how, Microformats already use the ‘profile’ attribute, which is in principle exactly the same as XML namespaces (and without it, microformats markup is pretty meaningless), there is nothing more complicated in RDF/A.
Anyways, ‘me’, I suppose that RDF/A is still a very new technology so I can understand that you’re still a bit hesitant to use it *right now*. On the other hand, why not :). It’s only a couple of attributes here and there.
The only way for technologies to become successful is to have them being used, and there is nothing wrong with some early adoption. Especially not given that if RDF/A isn’t ready to be used now, then when is it? RDF/A doesn’t need browser support (seeing how quickly things like bookmarklets and Greasemonkey scripts are developed for microformats), so you don’t have to wait for that.
Ok, let me partially answer my own question: it would be nice if there were at least a Recommendation and an XHTML 1.1 profile :). But is that really important, or just a technical detail? It will become a Recommendation at some point and I don’t think it will change significantly, and it’s not as if it wouldn’t work without a DTD; those are pretty irrelevant nowadays anyway.
What strikes me is that some people are generally distrusting and critical to the extreme towards new W3C standards, but as soon as other bodies like the WHATWG specify something all is great and fine.
~Grauw
Posted by Laurens Holst at 2:03AM
Fantastic to see a good discussion of RDFa (yes, we're taking the "/" out of the name) and microformats.
Microformats are great because they forced the semantic web crowd to think about HTML as a way to convey semantic information. That's a big accomplishment. At the same time, MFs are very un-web-like. They're not opportunistic. I can't take a couple of fields from one MF, merge them with another MF, and publish my own metadata. The web should allow that. RDFa allows that.
RDFa is about loosely coupled metadata. The loose coupling is the property that made the web take off, it's the property that creates the most interoperability with the least amount of effort and structure. The publisher gets to decide what metadata to publish. The tools pick up as much metadata as they understand.
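One way to picture that loose coupling is a consumer that simply ignores what it does not recognise. The Python sketch below assumes a hypothetical tool that only knows two Dublin Core properties; the page URL, the ex:mood property, and the understood helper are made up for illustration.

```python
# Predicates this hypothetical tool understands.
KNOWN_PREDICATES = {"dc:title", "dc:creator"}

# Metadata the publisher chose to publish, mixing vocabularies freely.
published = [
    ("http://example.org/post", "dc:title", "Basic principles of RDF"),
    ("http://example.org/post", "dc:creator", "Anne van Kesteren"),
    ("http://example.org/post", "ex:mood", "sceptical"),  # unknown to this tool
]


def understood(triples, known=KNOWN_PREDICATES):
    """Keep only the statements whose predicate the tool recognises."""
    return [t for t in triples if t[1] in known]


picked = understood(published)
print(picked)
```

The unknown statement does not break anything; another tool that knows ex:mood could pick it up from the same page.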
By the way, we have a proposal for transforming MFs into RDFa: hGRDDL. This means that you can build tools that are generic, RDFa capable, and still parse microformats.
You can also keep track of all RDFa news at rdfa.creativecommons.org (the hostname is temporarily with CC, but in the long run it will be independent.)
Posted by Ben Adida at 8:22PM
I think http://www.rdfa.info/ is the new, prettier URL?
Posted by Laurens Holst at 10:38AM
RDF isn't designed as a language for encoding pages to render them. It is designed as a language for encoding information. If you want to put hypertext onto the web HTML does an adequate job already. If you want a data storage format for a large set of data with complex interrelations, that you can extend either in terms of the kind of information or by adding in a large amount of information from someone else (I am thinking something on the order of several million statements), and then automate a complex query over it, HTML is not especially good.
This is why my.opera.com has a SPARQL query service - it's an efficient way to make arbitrary queries.
Integrating into HTML isn't necessary for it to succeed. It isn't meant to replace HTML in any of the things that HTML is any good for. It is, of course, possible to use GRDDL or similar approaches (piggybank's data extractors, for example) to make it authorable in HTML, and there is a whole collection of tools to render output as HTML.
And although it isn't trivial to understand a classification system that is scalable enough for the whole world to use before the rest of the world has agreed on it, RDF is not actually very hard.
There are a few key points, of course. It is additive - and doesn't have true negation. It therefore relies on some social behavioural conventions for optimal function. In particular, being nice about defining things in other people's namespaces.
On the other hand, this power makes it easy to annotate a vocabulary (ontology, classification, set of terms, call it what you will) that was originally described in another language, in a way that makes it simple to extract the documentation, for example to present the HTML version of a query result in the desired language, or to build interfaces that can label themselves dynamically, and it makes it easy to gather together very large collections of only slightly related data in order to do some query.
This is stuff that HTML isn't designed for, nor used for.
Posted by chaals at 10:17PM