Anne van Kesteren

xsd:language

Sometimes you want to override the inherited language with the empty string value, as in xml:lang="". Today, while writing some RELAX NG for a university project, I was using the XML Schema xsd:language datatype, which does not accept the empty string. No more. I now use language = xsd:string { pattern = "([a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*)?" } for this.
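
As a rough check of what that pattern accepts, here is a minimal Python sketch. It assumes that Python's re.fullmatch is a fair stand-in for an XSD pattern facet, which always has to match the whole value. The helper name is_language_or_empty is made up for illustration and is not part of any schema tooling.

```python
import re

# The pattern from the post. An XSD pattern must match the entire value,
# so re.fullmatch is used here as the (assumed) Python equivalent.
LANGUAGE_OR_EMPTY = re.compile(r"([a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*)?")


def is_language_or_empty(value: str) -> bool:
    """Hypothetical helper: True if value matches the post's pattern."""
    return LANGUAGE_OR_EMPTY.fullmatch(value) is not None


# Print each candidate next to the result of the check.
for candidate in ["", "en", "en-US", "zh-Hant-TW", "en_US", "toolongsubtag"]:
    print(repr(candidate), is_language_or_empty(candidate))
```

The trailing ? on the outer group is the only thing that lets the empty string through. Without it the expression is the same shape check that xsd:language performs.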

Comments

  1. I think that’s still an oversimplification and a proper custom datatype is needed.

    Posted by Henri Sivonen at

  2. I see your version does not allow the empty string value though. Would this imply you have to describe xml:lang as attribute xml:lang { language | "" } for example instead of attribute xml:lang { language }? What is the reason you picked that approach? I agree by the way that a more strict datatype for languages is probably better. I actually thought xsd:language was until I checked it.

    Posted by Anne at

  3. “I see your version does not allow the empty string value though. Would this imply you have to describe xml:lang as attribute xml:lang { language | "" } for example instead of attribute xml:lang { language }?”

    Yes.

    “What is the reason you picked that approach?”

    I picked this approach for consistency with the datetime types. The empty string is not really a datetime, but sometimes you may want to allow it anyway. It is easy to say “| ""”, so leaving empty string handling to the schema language layer is not onerous to the schema writer.

    Moreover, as with dates, strictly speaking, the empty string is not a conforming RFC 3066bis language tag. Allowing the empty string is an XML-level or HTML5-level matter.

    “I agree by the way that a more strict datatype for languages is probably better. I actually thought xsd:language was until I checked it.”

    It is not unusual to think that the XSD datatypes are what you want until you take a closer look. :-(

    Posted by Henri Sivonen at

  4. I usually strongly recommend using xsd:token instead of xsd:string (see http://books.xmlschemata.org/relaxng/relax-CHP-7-SECT-4.html)...

    Eric

    Posted by Eric van der Vlist at
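
One practical difference between the two datatypes in comment 4 is whitespace handling: xsd:token collapses whitespace before facets such as pattern are checked, while xsd:string preserves it. The Python sketch below only simulates that behaviour with a regular expression. The function names are hypothetical, and the linked chapter may argue the point on other grounds as well.

```python
import re

# Same pattern as in the post, checked against the whole value.
PATTERN = re.compile(r"([a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*)?")


def collapse(value: str) -> str:
    """Simulate whiteSpace="collapse": turn tab, CR and LF into spaces,
    squeeze runs of spaces into one, and trim both ends."""
    return re.sub(r"[ \t\r\n]+", " ", value).strip(" ")


def matches_as_string(value: str) -> bool:
    """Simulated xsd:string: the value is checked exactly as written."""
    return PATTERN.fullmatch(value) is not None


def matches_as_token(value: str) -> bool:
    """Simulated xsd:token: whitespace is collapsed before the check."""
    return PATTERN.fullmatch(collapse(value)) is not None


padded = " en-US\n"
print(matches_as_string(padded))
print(matches_as_token(padded))
```

Under these assumptions a value with stray leading or trailing whitespace fails the string-based check but passes the token-based one, which is usually what a schema author wants for attribute values.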

  5. Now using attribute xml:lang { xsd:language | "" } based on feedback. Not optimal, but probably better than what I had.

    Posted by Anne van Kesteren at