xsd:language
Sometimes you want to be able to override the inherited language with the empty string value, using something like xml:lang="". Today I’m working on a university project, writing some RELAX NG that used the XML Schema xsd:language datatype, which does not accept the empty string. No more. I now use language = xsd:string { pattern = "([a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*)?" } for this stuff.
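To see what the extra optional group changes, here is a minimal sketch that compares the pattern XML Schema gives for xsd:language with the wrapped pattern from the post. It uses Python's re module as a stand-in for XSD regular expressions, which is an assumption that happens to hold for these simple ASCII character classes; the names XSD_LANGUAGE, POST_LANGUAGE and matches are made up for illustration.

```python
import re

# Pattern that XML Schema Part 2 specifies for xsd:language.
XSD_LANGUAGE = r"[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*"
# The post's variant: the same pattern wrapped in an optional group.
POST_LANGUAGE = r"([a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*)?"


def matches(pattern: str, value: str) -> bool:
    # XSD patterns are implicitly anchored, so fullmatch is the closest analogue.
    return re.fullmatch(pattern, value) is not None


if __name__ == "__main__":
    for value in ["en", "en-US", "", "not-a-real-language", "toolonglanguage"]:
        print(repr(value), matches(XSD_LANGUAGE, value), matches(POST_LANGUAGE, value))
```

The two patterns should differ only on the empty string. Both accept any run of short alphanumeric subtags, whether or not it names a real language, so neither is a strict language check.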
I think that’s still an oversimplification and a proper custom datatype is needed.
I see your version does not allow the empty string value though. Would this imply you have to describe xml:lang as attribute xml:lang { language | "" } for example instead of attribute xml:lang { language }? What is the reason you picked that approach? I agree by the way that a more strict datatype for languages is probably better. I actually thought xsd:language was until I checked it.
I see your version does not allow the empty string value though. Would this imply you have to describe xml:lang as attribute xml:lang { language | "" } for example instead of attribute xml:lang { language }?
Yes.
What is the reason you picked that approach?
I picked this approach for consistency with the datetime types. The empty string is not really a datetime, but sometimes you may want to allow it anyway. It is easy to say “| ""”, so leaving empty string handling to the schema language layer is not onerous to the schema writer.
Moreover, as with dates, strictly speaking, the empty string is not a conforming RFC 3066bis language tag. Allowing the empty string is an XML-level or HTML5-level matter.
I agree by the way that a more strict datatype for languages is probably better. I actually thought xsd:language was until I checked it.
It is not unusual to think that the XSD datatypes are what you want until you take a closer look. :-(
I usually strongly recommend using xsd:token instead of xsd:string (see http://books.xmlschemata.org/relaxng/relax-CHP-7-SECT-4.html)...
Eric
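One practical difference between xsd:token and xsd:string is whitespace handling. As far as I understand the XML Schema datatypes, xsd:token collapses whitespace before facets such as pattern are checked, while xsd:string preserves it. The sketch below only emulates that behaviour in Python; it is not how a RELAX NG validator is implemented, and collapse, valid_as_string and valid_as_token are hypothetical helper names.

```python
import re

# The pattern from the post, reused here for both base types.
LANGUAGE = r"([a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*)?"


def collapse(value: str) -> str:
    # Emulates whiteSpace="collapse": tabs and line breaks become spaces,
    # runs of spaces shrink to one, leading and trailing spaces are dropped.
    return re.sub(r"[ \t\n\r]+", " ", value).strip(" ")


def valid_as_string(value: str) -> bool:
    # xsd:string preserves whitespace, so the raw value must match.
    return re.fullmatch(LANGUAGE, value) is not None


def valid_as_token(value: str) -> bool:
    # xsd:token is checked after whitespace has been collapsed.
    return re.fullmatch(LANGUAGE, collapse(value)) is not None


if __name__ == "__main__":
    for value in ["en-US", " en-US\n", "   "]:
        print(repr(value), valid_as_string(value), valid_as_token(value))
```

Under these assumptions, an attribute value with stray surrounding whitespace is rejected with xsd:string as the base type but accepted with xsd:token, and a whitespace-only value collapses to the empty string.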
Now using attribute xml:lang { xsd:language | "" } based on feedback. Not optimal, but probably better than what I had.