by Eric van der Vlist is published by O'Reilly & Associates (ISBN: 0596004214)


DTD Compatibility Datatypes

DTD compatibility is both a datatype library, which checks the lexical spaces of its ID, IDREF, and IDREFS datatypes, and a broader feature built on that library. The feature adds to normal RELAX NG processing by enforcing DTD-like rules on both the schema and the instance document. It is designed to ease the transition from DTDs to RELAX NG by emulating the attribute types ID, IDREF, and IDREFS. The DTD compatibility feature checks that ID values are unique within a document and that IDREF and IDREFS values are, respectively, a reference or a whitespace-separated list of references to ID values defined in the document. It also checks the schema itself to ensure that these datatypes are used only in attributes. Unlike their W3C XML Schema counterparts, these datatypes have no facets.
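
The library example that follows uses only ID, so here is a minimal sketch of the reference side. The creator and friends attributes are hypothetical names chosen for illustration; they are not part of the library schema:

 <element name="character" xmlns="http://relaxng.org/ns/structure/1.0"
   datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0">
   <attribute name="id">
     <data type="ID"/>
   </attribute>
   <optional>
     <!-- hypothetical: must match one ID value found in the document -->
     <attribute name="creator">
       <data type="IDREF"/>
     </attribute>
   </optional>
   <optional>
     <!-- hypothetical: whitespace-separated list of ID values -->
     <attribute name="friends">
       <data type="IDREFS"/>
     </attribute>
   </optional>
   <text/>
 </element>

With this sketch, a creator or friends value that doesn't correspond to any id value in the same document should be reported by a processor that implements the DTD compatibility feature.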

That's pretty much all you have to know about this datatype library. Let's use it straightaway to define the id attributes in our library schema:

<element xmlns="http://relaxng.org/ns/structure/1.0" name="library"
datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <oneOrMore>
   <element name="book">
    <attribute name="id">
     <data datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0"
                      type="ID"/>
    </attribute>
    <attribute name="available">
     <data type="boolean"/>
    </attribute>
    <element name="isbn">
     <data type="NMTOKEN">
       <param name="pattern">[0-9]{9}[0-9x]</param>
     </data>
    </element>
    <element name="title">
     <attribute name="xml:lang">
      <data type="language">
       <param name="length">2</param>
      </data>
     </attribute>
     <data type="token">
       <param name="maxLength">255</param>
     </data>
    </element>
    <zeroOrMore>
     <element name="author">
      <attribute name="id">
       <data datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0"
                       type="ID"/>
      </attribute>
      <element name="name">
       <data type="token">
        <param name="maxLength">255</param>
       </data>
      </element>
      <element name="born">
       <data type="date">
        <param name="minInclusive">1900-01-01</param>
        <param name="maxInclusive">2099-12-31</param>
        <param name="pattern">[0-9]{4}-[0-9]{2}-[0-9]{2}</param>
       </data>
      </element>
      <optional>
       <element name="died">
        <data type="date">
         <param name="minInclusive">1900-01-01</param>
         <param name="maxInclusive">2099-12-31</param>
         <param name="pattern">[0-9]{4}-[0-9]{2}-[0-9]{2}</param>
        </data>
       </element>
      </optional>
     </element>
    </zeroOrMore>
    <zeroOrMore>
     <element name="character">
      <attribute name="id">
       <data datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0"
                  type="ID"/>
      </attribute>
      <element name="name">
       <data type="token">
        <param name="maxLength">255</param>
       </data>
      </element>
      <element name="born">
       <data type="date">
        <param name="minInclusive">1900-01-01</param>
        <param name="maxInclusive">2099-12-31</param>
        <param name="pattern">[0-9]{4}-[0-9]{2}-[0-9]{2}</param>
       </data>
      </element>
      <element name="qualification">
       <data type="token">
        <param name="maxLength">255</param>
       </data>
      </element>
     </element>
    </zeroOrMore>
   </element>
  </oneOrMore>
 </element>

or:

 datatypes dtd="http://relaxng.org/ns/compatibility/datatypes/1.0"
 element library {
  element book {
   attribute id {dtd:ID},
   attribute available {xsd:boolean},
   element isbn {xsd:NMTOKEN {pattern = "[0-9]{9}[0-9x]"}},
   element title {
     attribute xml:lang {xsd:language {length="2"}},
     xsd:token {maxLength="255"}
   },
   element author {
    attribute id {dtd:ID},
    element name {xsd:token {maxLength = "255"}},
    element born {xsd:date {
      minInclusive = "1900-01-01"
      maxInclusive = "2099-12-31"
      pattern = "[0-9]{4}-[0-9]{2}-[0-9]{2}"
    }},
    element died {xsd:date {
      minInclusive = "1900-01-01"
      maxInclusive = "2099-12-31"
      pattern = "[0-9]{4}-[0-9]{2}-[0-9]{2}"
    }}?}*,
   element character {
    attribute id {dtd:ID},
    element name {xsd:token {maxLength = "255"}},
    element born {xsd:date {
      minInclusive = "1900-01-01"
      maxInclusive = "2099-12-31"
      pattern = "[0-9]{4}-[0-9]{2}-[0-9]{2}"
    }},
    element qualification {xsd:token {maxLength = "255"}}}*
  } +
 }
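
To see what the feature adds on top of ordinary pattern matching, consider the following instance document. It is a sketch with illustrative values: every element and attribute matches the patterns of the schema above, but the value CMS is used for two id attributes:

 <library>
   <book id="b0836217462" available="true">
     <isbn>0836217462</isbn>
     <title xml:lang="en">Being a Dog Is a Full-Time Job</title>
     <author id="CMS">
       <name>Charles M Schulz</name>
       <born>1922-11-26</born>
       <died>2000-02-12</died>
     </author>
     <!-- same value as the author's id: violates ID uniqueness -->
     <character id="CMS">
       <name>Snoopy</name>
       <born>1950-10-04</born>
       <qualification>extroverted beagle</qualification>
     </character>
   </book>
 </library>

A processor that implements the DTD compatibility feature should reject this document because ID values must be unique within a document. Changing one of the two values to any other name that isn't already used as an id should be enough to make it pass.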

As already mentioned, the DTD compatibility feature is designed to provide compatibility with DTDs, and that includes emulating some of their restrictions. One of them, seen above, is that these datatypes can be used only in attributes, not in elements. Another limitation is more insidious and has bitten renowned experts attempting tasks such as writing RELAX NG schemas for XHTML.

This rule might be called the "consistent attribute definition rule." Because a DTD doesn't allow two different definitions of the same attribute of an element, the DTD compatibility feature enforces the rule that if an attribute is defined as ID, IDREF, or IDREFS in an element anywhere in a RELAX NG schema, all definitions of the same attribute under the same element must use the same type.

The simplest schemas that break this rule, and thus aren't correct with respect to the DTD compatibility feature, contain multiple definitions of the same element in which the same attribute has different types, such as:

<?xml version="1.0" encoding="UTF-8"?>
 <element name="foo" xmlns="http://relaxng.org/ns/structure/1.0"
   datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0">
   <element name="bar">
     <attribute name="id">
       <data type="ID"/>
     </attribute>
   </element>
   <zeroOrMore>
     <element name="bar">
       <attribute name="id">
         <data type="token" datatypeLibrary=""/>
       </attribute>
     </element>
   </zeroOrMore>
 </element>

or:

 datatypes dtd="http://relaxng.org/ns/compatibility/datatypes/1.0"
    
 element foo {
    element bar {
      attribute id { dtd:ID }
    },
    element bar {
      attribute id { token }
    } *
  }

Here, there are two definitions of bar with id attributes having competing types. Because one of these types is a dtd:ID type, this practice is forbidden.
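
One possible repair, sketched here on the assumption that every id value on a bar element really is meant to be an identifier, is to give both definitions the same type:

 <?xml version="1.0" encoding="UTF-8"?>
 <element name="foo" xmlns="http://relaxng.org/ns/structure/1.0"
   datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0">
   <element name="bar">
     <attribute name="id">
       <data type="ID"/>
     </attribute>
   </element>
   <zeroOrMore>
     <element name="bar">
       <!-- now the same type as in the first definition of bar -->
       <attribute name="id">
         <data type="ID"/>
       </attribute>
     </element>
   </zeroOrMore>
 </element>

If the second id isn't meant to be an identifier, the alternative is to drop dtd:ID from both definitions and give up the uniqueness check for this attribute.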

A situation that is tougher to detect and tougher to fix arises when one of the competing definitions uses a name class that matches any element, as you will see in Chapter 12. Such a definition silently competes with every named element in the schema, which can create serious complications.
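
As a rough sketch of how this can happen (name classes themselves are covered in Chapter 12, and the foo and bar names are only illustrative), the second definition below never mentions bar or id, yet it competes with the first one:

 <?xml version="1.0" encoding="UTF-8"?>
 <element name="foo" xmlns="http://relaxng.org/ns/structure/1.0">
   <element name="bar">
     <attribute name="id">
       <data type="ID"
         datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0"/>
     </attribute>
   </element>
   <zeroOrMore>
     <!-- anyName also matches bar, and the attribute wildcard also
          matches id, but with plain text content rather than ID -->
     <element>
       <anyName/>
       <zeroOrMore>
         <attribute>
           <anyName/>
         </attribute>
       </zeroOrMore>
       <text/>
     </element>
   </zeroOrMore>
 </element>

Because the wildcard can match a bar element with an id attribute that isn't of type ID, a processor implementing the DTD compatibility feature should report the same kind of conflict as in the previous example. Fixing it usually means excluding the conflicting names from the wildcard, which is harder than aligning two explicit definitions.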


This text is released under the Free Software Foundation GFDL.