by Eric van der Vlist is published by O'Reilly & Associates (ISBN: 0596004214)
All the RELAX NG implementations must support the native datatype library; many of them also support the DTD compatibility datatypes library and the W3C XML Schema datatypes library. That means that if you want to define a token or string datatype, you can often choose between the native library and W3C XML Schema datatypes, and if you're defining ID, IDREF, or IDREFS, you can often choose between the DTD compatibility library and W3C XML Schema datatypes.
The criterion for choosing between native and W3C XML Schema datatypes to define string and token types is simple: if you need facets, use W3C XML Schema datatypes. If you don't, use native datatypes: your schema will be more portable, because RELAX NG processors aren't obliged to support the W3C XML Schema type library.
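As a minimal sketch of this rule of thumb, the schema below uses the native library where no facet is needed and switches to W3C XML Schema datatypes only for the pattern that needs one. The element names `product`, `name`, and `code`, and the `maxLength` value, are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<element name="product" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- No facet needed: the native library (empty datatypeLibrary) is enough -->
  <element name="name">
    <data type="token" datatypeLibrary=""/>
  </element>
  <!-- A facet is needed: use W3C XML Schema datatypes for this pattern only -->
  <element name="code">
    <data type="token"
          datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
      <param name="maxLength">8</param>
    </data>
  </element>
</element>
```

A processor that supports only the native library can still handle the `name` pattern; it is the `code` pattern that ties the schema to W3C XML Schema datatype support.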
When you need to define a datatype covered by both DTD and W3C XML Schema—i.e., ID, IDREF, or IDREFS—a similar rule of thumb applies. If you use the DTD compatibility library, your schema should be slightly more portable, but you will lose the facets.
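For illustration, here is a sketch of a schema that takes ID, IDREF, and IDREFS from the DTD compatibility library (http://relaxng.org/ns/compatibility/datatypes/1.0). The vocabulary (`library`, `author`, `book`, and the attribute names) is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<element name="library" xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0">
  <oneOrMore>
    <element name="author">
      <!-- Each author carries a unique identifier -->
      <attribute name="id"><data type="ID"/></attribute>
      <text/>
    </element>
  </oneOrMore>
  <zeroOrMore>
    <element name="book">
      <attribute name="id"><data type="ID"/></attribute>
      <!-- Whitespace-separated list of author identifiers -->
      <attribute name="authors"><data type="IDREFS"/></attribute>
      <optional>
        <!-- Single reference to another book -->
        <attribute name="sequel-of"><data type="IDREF"/></attribute>
      </optional>
      <text/>
    </element>
  </zeroOrMore>
</element>
```

Since this library defines no facets, a `param` child inside these `data` patterns would not be accepted. Constraining an identifier further (its length, for instance) is a reason to prefer the W3C XML Schema version of these types.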
The other factor to take into account is that the rules applied when you use the DTD compatibility feature are strict and consistent over different implementations; when you use the W3C XML Schema type library, a processor should apply these same rules if and only if it also supports the DTD datatype library. Processors that support only W3C XML Schema datatypes are supposed to check only the lexical space of these datatypes.
In practice, that means you can use the ID, IDREF, or IDREFS datatypes from the W3C XML Schema library, but it is safer to debug your schema with an implementation that supports both the DTD and the W3C XML Schema type libraries.
If you design a RELAX NG schema using W3C XML Schema's ID, IDREF, and IDREFS, and then test it with an implementation that supports only W3C XML Schema datatypes, the rules of DTD compatibility will not be enforced. When you use the same schema and instance documents with a RELAX NG processor supporting both the DTD and W3C XML Schema datatypes, you get tighter control; the instance documents and even the schema that were previously valid may suddenly become invalid or incorrect because of this control.
Here's a simple example of a schema that defines ID elements. It's correct for RELAX NG implementations that support W3C XML Schema datatypes without supporting the DTD compatibility layer, and yet it doesn't meet the requirements of the DTD compatibility feature, which expects these datatypes on attributes rather than elements, for RELAX NG implementations supporting both.
<?xml version="1.0" encoding="UTF-8"?>
<element name="foo" xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <zeroOrMore>
    <element name="bar">
      <element name="id">
        <data type="ID"/>
      </element>
    </element>
  </zeroOrMore>
</element>
or:
element foo {
  element bar {
    element id { xsd:ID }
  }*
}
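By way of contrast, the following sketch is a variant of the same schema that both kinds of processor should accept: the identifier is moved from a child element to an attribute. Using an attribute named `id` is an assumption made for this illustration, not something the original example prescribes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<element name="foo" xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <zeroOrMore>
    <element name="bar">
      <!-- ID carried by an attribute, as a DTD would require -->
      <attribute name="id">
        <data type="ID"/>
      </attribute>
    </element>
  </zeroOrMore>
</element>
```

With this variant, a processor that implements only the W3C XML Schema datatypes checks the lexical space of each `id` value, while a processor that also implements DTD compatibility can additionally check that the values are unique.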
Other examples include schemas that don't respect the rule by which the definitions of attributes holding these datatypes must be consistent throughout the schema.
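A sketch of such an inconsistent schema might look like the following, where the `id` attribute of `item` is an ID in one branch of the choice and a plain token in the other. The names `list` and `item` are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<element name="list" xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <choice>
    <element name="item">
      <!-- Here the id attribute of item is declared as an ID... -->
      <attribute name="id"><data type="ID"/></attribute>
    </element>
    <element name="item">
      <!-- ...and here the same attribute of the same element is a token -->
      <attribute name="id"><data type="token"/></attribute>
    </element>
  </choice>
</element>
```

A processor that ignores DTD compatibility should see nothing wrong with this schema, whereas one that enforces the feature is expected to report the conflicting definitions of `id`.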
The reason for this behavior is that, although I've often referred to the "DTD compatibility datatype library" throughout this chapter for clarity, DTD compatibility is more than a datatype library. Per the RELAX NG formal specification, a datatype library must be decoupled from the validation of the structure of the document, and the context passed to the datatype library is restricted to the namespace declarations in scope for the node being validated. This context is itself an exception, required to process qualified names. The datatype library thus doesn't have enough information to perform the tests required to support DTD compatibility: it doesn't even know whether the data to validate was found in an element or an attribute. This aspect of DTD compatibility is thus a feature and not a datatype library as defined by RELAX NG.
When you use a datatype from the datatype library http://relaxng.org/ns/compatibility/datatypes/1.0, you're actually doing two different things:
Using a datatype library that restricts the lexical space of your data and value patterns.
Requesting tests to ensure that the IDs are unique, and that the IDREF and IDREFS values refer to existing IDs and lists of IDs, respectively.
Applied to the W3C XML Schema datatype library, this translates as: if these datatypes are used, trigger the ID DTD compatibility feature when available.
This text is released under the Free Software Foundation GFDL.