RELAX NG by Eric van der Vlist will be published by O'Reilly & Associates (ISBN: 0596004214)



Which library should we use?

All RELAX NG implementations must support the native datatype library; many also support the DTD compatibility datatype library and the W3C XML Schema datatype library. This means that to define a token or string datatype we will often have a choice between the native library and the W3C XML Schema datatypes, and to define ID, IDREF or IDREFS we will often have a choice between the DTD compatibility library and the W3C XML Schema datatypes.
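
As a minimal sketch of the first choice, here is the same element declared with the native token datatype and then with the W3C XML Schema one. The element name title and the maxLength value are illustrative assumptions, not taken from a real vocabulary. The native datatype takes no parameters, while the W3C XML Schema datatype accepts facets:

 element title { token }

or:

 element title { xsd:token { maxLength = "80" } }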

When you need to define a datatype covered by both the DTD compatibility library and W3C XML Schema, i.e. ID, IDREF or IDREFS, the rule of thumb is a trade-off between portability and features: if you use the DTD compatibility library, your schema should be slightly more portable, but you lose the facets.
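
The trade-off can be sketched as follows, assuming a hypothetical item element with an id attribute (both names, and the pattern value, are made up for illustration). With the DTD compatibility library the prefix has to be declared and no parameter is allowed:

 datatypes dtd = "http://relaxng.org/ns/compatibility/datatypes/1.0"

 element item {
   attribute id { dtd:ID },
   text
 }

or, with the W3C XML Schema library, which lets us add a facet:

 element item {
   attribute id { xsd:ID { pattern = "item-[0-9]+" } },
   text
 }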

The other factor to take into account is that the rules applied when you use the DTD compatibility feature are strict and consistent across implementations. If you use the W3C XML Schema datatype library instead, a processor should apply these same rules if and only if it also supports the DTD compatibility library; processors which support only the W3C XML Schema datatypes are only supposed to check the lexical space of these datatypes.

In practice, this means that you can use the ID, IDREF or IDREFS datatypes from the W3C XML Schema library, but it is safer to debug your schema with an implementation which supports both the DTD compatibility and the W3C XML Schema datatype libraries.

If you design a RELAX NG schema using W3C XML Schema's ID, IDREF and IDREFS, and then test it with an implementation which supports only W3C XML Schema datatypes, the rules of DTD compatibility will not be enforced. When you use the same schema and instance documents with a RELAX NG processor supporting both the DTD and W3C XML Schema datatypes, you get tighter control; the instance documents and even the schema which were previously valid may suddenly become invalid or incorrect because of this control.
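
To illustrate the instance document case, here is a small sketch; the element and attribute names and the a1 value are assumptions chosen for the example. Given this schema:

 element foo {
   element bar {
     attribute id { xsd:ID }
   } *
 }

the following document repeats the same identifier:

 <foo>
   <bar id="a1"/>
   <bar id="a1"/>
 </foo>

A processor which only checks the lexical space of xsd:ID should accept it, since a1 is lexically a valid ID. A processor which also supports the DTD compatibility feature should reject it because the identifier is not unique.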

A simple example of a schema which is correct for RELAX NG implementations supporting the W3C XML Schema datatypes without the DTD compatibility layer, yet violates the DTD compatibility rules for implementations supporting both, is this schema, which uses the ID datatype in elements rather than in attributes:

 <?xml version="1.0" encoding="UTF-8"?>
 <element name="foo" xmlns="http://relaxng.org/ns/structure/1.0"
  datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
   <zeroOrMore>
     <element name="bar">
       <element name="id">
         <data type="ID"/>
       </element>
     </element>
   </zeroOrMore>
 </element>

or:

 element foo {
   element bar {
     element id { xsd:ID }
   } *
 }

Other examples include schemas which break the rule that the definitions of attributes holding these datatypes must be consistent throughout the schema.
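
A sketch of such an inconsistency, using hypothetical doc, section and appendix elements: the id attribute of section is declared as an ID in one place and as a plain token in another. Without the DTD compatibility feature this is a correct schema; with it, a processor should report the two conflicting definitions:

 element doc {
   element section {
     attribute id { xsd:ID },
     text
   } *,
   element appendix {
     element section {
       attribute id { xsd:token },
       text
     } *
   }
 }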

The reason for this behavior is that, although I have often spoken of the "DTD compatibility datatype library" for clarity throughout this chapter, DTD compatibility is more than a datatype library. Per the RELAX NG formal specification, a datatype library must be decoupled from the validation of the structure of the document, and the context passed to the datatype library is restricted to the namespace declarations in scope for the node being validated. This context is itself an exception, required to process qualified names. The datatype library thus does not have enough information to perform the tests required to support DTD compatibility: it doesn't even know whether the data to validate was found in an element or in an attribute. This aspect of DTD compatibility is thus a feature and not a datatype library as defined by RELAX NG.

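When we use a datatype from the datatype library http://relaxng.org/ns/compatibility/datatypes/1.0, we are actually doing two different things: using a datatype whose lexical space is checked by the datatype library, and requesting the DTD compatibility feature, which enforces the remaining rules against the structure of the document.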

Applied to the W3C XML Schema datatype library, this translates as: if these datatypes are used, trigger the ID feature of DTD compatibility when it is available.


All text is copyright Eric van der Vlist, Dyomedea. During development, I give permission for non-commercial copying for educational and review purposes. After publication, all text will be released under the Free Software Foundation GFDL.