RELAX NG by Eric van der Vlist will be published by O'Reilly & Associates (ISBN: 0596004214)

You are welcome to use our annotation system to give your feedback.


Since its conception, RELAX NG has tried to keep a balance between simplicity of use, simplicity of implementation, and simplicity of its data model. What's simple to implement is often simple to use, but many features are very effective for users while adding complexity for implementers and cluttering the data model. This is the case, for instance, for all the features designed to create building blocks (named patterns, includes, embedded grammars). They are very helpful to users, but whether you've used named patterns or a Russian doll style has no impact on the validation itself. This is also the case for shortcuts such as the mixed pattern, which is really just a more concise way of writing an interleave pattern with an embedded text pattern.

The quest for simplicity has had a huge influence on the design of RELAX NG. Here is the view of James Clark on the subject:

"Simplicity of specification often goes hand in hand with simplicity of use. But I find that these are often in conflict with simplicity of implementation. An example would be ambiguity restrictions as in XSD: these make implementation simpler (well, at least for people who don't want to learn a new algorithm) but make specification and use more complex. In general, RELAX NG aims for implementation to be practical and safe (i.e. implementations shouldn't use huge amounts of time/memory for particular schemas/instances), but apart from that favors simplicity of specification/use over simplicity of implementation."

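To keep the description of the restrictions and of the validation algorithm simple while continuing to offer valuable features to users, RELAX NG describes validation as a two-step process: the schema is first simplified, and the restrictions and the validation algorithm are then defined on the simplified schema.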

The simplification is described for each RELAX NG element in the RELAX NG specification, so we won't dive into its details here and will just give the main points. Simplification removes all syntactic sugar, consolidates all external schemas, reduces the schema to a subset of the available RELAX NG elements, and transforms the resulting structure into a flat schema. In the result, each element is embedded in a named pattern, and every named pattern contains the definition of a single element.

The RELAX NG specification is very clear that this simplification is done by RELAX NG processors on the data model after the complete schema has been read. The result of this simplification never has to be serialized as XML. However, showing intermediary results as XML helps to show what the simplification process does.


Intermediary results are indented for readability. In reality, whitespace is removed in one of the first steps of the simplification.

The XML syntax is closer than the compact syntax to the data model used to describe the simplification, so the details of the simplification are shown below as XML snippets. For each sequence of steps, I've also given the compact syntax for the whole schema to give a better overall view of the impact on the structure of the schema, although some effects of simplification are lost when using the compact syntax.

The first step of simplification performs various normalizations without changing the structure of the schema:

After this set of steps, the structure of the schema is still unchanged, but all cosmetic features, which have no impact on the meaning of the schema, have been removed. For instance, the following schema snippet:

 <?xml version="1.0" encoding="utf-8"?>
 <grammar xmlns=""
   <a:documentation>RELAX NG schema for our library</a:documentation>
      <sn:stage name="library"/>
      <sn:stage name="book"/>
      <sn:stage name="author"/>
      <sn:stage name="character"/>
      <sn:stage name="author-or-book"/>
       <element name=" library "  sn:stages="library">
           <ref name="book-element"/>
       <ref name="book-element" sn:stages="book author-or-book"/>
       <ref name="author-element" sn:stages="author author-or-book"/>
       <ref name="character-element" sn:stages="character"/>
   <define name=" author-element ">
     <element name="hr:author" datatypeLibrary="">
       <attribute name="id" datatypeLibrary="">
         <data type="NMTOKEN">
          <param name="maxLength"> 16 </param>
       <ref name=" name-element"/>
       <ref name="born-element"/>
         <ref name="died-element"/>


   <define name="available-content">
       <value type="token"> false </value>
       <value> </value>

will be transformed during this sequence of steps into this (note that I am still showing whitespace for readability even though it would have been removed):

 <?xml version="1.0"?>
 <grammar xmlns=""
       <element name="library">
           <ref name="book-element"/>
       <ref name="book-element"/>
       <ref name="author-element"/>
       <ref name="character-element"/>


   <define name="author-element">
     <element name="hr:author">
       <attribute name="id">
         <data datatypeLibrary="" type="NMTOKEN">
           <param name="maxLength"> 16 </param>
       <ref name="name-element"/>
       <ref name="born-element"/>
         <ref name="died-element"/>


   <define name="available-content">
       <value type="token" datatypeLibrary="">true</value>
       <value datatypeLibrary="" type="token"> false </value>
       <value type="token" datatypeLibrary=""> </value>
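
To make these normalizations more concrete, here is a minimal Python sketch of how a processor might apply them to a parsed schema. It is only an illustration under stated assumptions, not the algorithm of any particular implementation: the function name normalize, the sample schema and the annotation namespace http://example.org/annotations are made up for the example, and the rules it encodes (removing annotations, trimming whitespace, propagating datatypeLibrary down to data and value, defaulting the type of value to token) follow my reading of the simplification section of the specification.

```python
import xml.etree.ElementTree as ET

RNG = "http://relaxng.org/ns/structure/1.0"  # RELAX NG namespace URI


def normalize(elem, inherited_lib=""):
    """Apply the first-step normalizations to elem, in place (sketch)."""
    local = elem.tag.split("}")[-1]
    # Annotations: drop attributes from foreign namespaces ("{uri}name").
    for key in [k for k in elem.attrib if k.startswith("{")]:
        del elem.attrib[key]
    # Whitespace: trim the name, type and combine attributes.
    for key in ("name", "type", "combine"):
        if key in elem.attrib:
            elem.set(key, elem.get(key).strip())
    # datatypeLibrary is inherited and kept only on data and value.
    lib = elem.attrib.pop("datatypeLibrary", inherited_lib)
    if local == "value" and "type" not in elem.attrib:
        elem.set("type", "token")  # default type, from the built-in library
        lib = ""
    if local in ("data", "value"):
        elem.set("datatypeLibrary", lib)
    # Whitespace-only text is dropped, except inside value and param.
    if local == "name" and elem.text:
        elem.text = elem.text.strip()
    elif local not in ("value", "param") and not (elem.text or "").strip():
        elem.text = None
    for child in list(elem):
        if not (child.tail or "").strip():
            child.tail = None
        if child.tag.startswith("{" + RNG + "}"):
            normalize(child, lib)
        else:
            elem.remove(child)  # annotation element from a foreign namespace


# Hypothetical sample schema, loosely modeled on the library example.
SRC = """<grammar xmlns="http://relaxng.org/ns/structure/1.0"
    xmlns:a="http://example.org/annotations"
    datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <a:documentation>demo</a:documentation>
  <define name=" id-attribute " a:note="x">
    <attribute name="id">
      <data type="NMTOKEN"><param name="maxLength"> 16 </param></data>
    </attribute>
  </define>
  <define name="flag"><value> </value></define>
</grammar>"""

root = ET.fromstring(SRC)
normalize(root)
print(ET.tostring(root, encoding="unicode"))
```

Note that the whitespace inside value and param is left alone, which is why " 16 " and " " survive in the listings above.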

After our first series of steps, our schema looks like:

 namespace a = ""
 namespace hr = ""
 namespace local = ""
 default namespace ns1 = ""
 namespace sn = ""
 start =
   element library { book-element+ }
   | book-element
   | author-element
   | character-element
 include "foreign.rnc" {
   foreign-elements = element * - (local:* | ns1:* | hr:*) { anything }*
   foreign-attributes = attribute * - (local:* | ns1:* | hr:*) { text }*
 author-element =
   element hr:author {
     attribute id {
       xsd:NMTOKEN { maxLength = " 16 " }
 include "book-content.rnc"
 book-content &= foreign-nodes
 book-element = element book { book-content }
 born-element = element hr:born { xsd:date }
 character-element = external "character-element.rnc"
 died-element = element hr:died { xsd:date }
 isbn-element = element isbn { foreign-attributes, token }
 name-element = element hr:name { xsd:token }
 qualification-element = element qualification { text }
 title-element = element title { foreign-attributes, text }
 available-content = "true" | xsd:token " false " | " "

The second sequence of steps reads and processes externalRef and include patterns:

After our second step, we obtain a standalone schema without any reference to external documents.

The following snippet:

   <define name="character-element">
     <externalRef href="character-element.rng" ns=""/>

will be transformed into:

   <define name="character-element">
     <grammar ns="">
         <element name="character">
           <attribute name="id"/>
           <parentRef name="name-element"/>
           <parentRef name="born-element"/>
           <parentRef name="qualification-element"/>

And the snippet:

   <include href="foreign.rng">
     <define name="foreign-elements">
               <nsName ns=""/>
               <nsName ns=""/>
               <nsName ns=""/>
           <ref name="anything"/>
     <define name="foreign-attributes">
               <nsName ns=""/>
               <nsName ns=""/>
               <nsName ns=""/>

will be transformed into:


     <define name="foreign-elements">
               <nsName ns=""/>
               <nsName ns=""/>
               <nsName ns=""/>
           <ref name="anything"/>
     <define name="foreign-attributes">
               <nsName ns=""/>
               <nsName ns=""/>
               <nsName ns=""/>
     <define name="anything">
             <ref name="anything"/>
     <define name="foreign-nodes">
           <ref name="foreign-attributes"/>
           <ref name="foreign-elements"/>
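
The two transformations above can be sketched in a few lines of Python. This is a simplified illustration rather than a complete implementation: the documents live in a hypothetical in-memory dictionary DOCS instead of being fetched by URI, the helper names q, load and resolve are made up, overridden definitions are only looked for among the direct children of the included grammar (nested div elements are ignored), and the sample schemas are cut-down stand-ins for the real ones.

```python
import xml.etree.ElementTree as ET

RNG = "http://relaxng.org/ns/structure/1.0"


def q(local):
    """Qualified ElementTree tag for a RELAX NG element."""
    return "{%s}%s" % (RNG, local)


def load(href, docs):
    """Parse a referenced schema and resolve its own references."""
    root = ET.fromstring(docs[href])
    resolve(root, docs)
    return root


def resolve(elem, docs):
    for i, child in enumerate(list(elem)):
        if child.tag == q("externalRef"):
            # externalRef is replaced by the root of the referenced schema.
            target = load(child.get("href"), docs)
            if "ns" in child.attrib and "ns" not in target.attrib:
                target.set("ns", child.get("ns"))
            elem[i] = target
        elif child.tag == q("include"):
            resolve(child, docs)  # the overriding definitions themselves
            included = load(child.get("href"), docs)
            # Definitions given inside include replace the included ones.
            overridden = {d.get("name") for d in child.findall(q("define"))}
            for d in included.findall(q("define")):
                if d.get("name") in overridden:
                    included.remove(d)
            if child.find(q("start")) is not None:
                for s in included.findall(q("start")):
                    included.remove(s)
            # Both include and the included grammar become div elements.
            included.tag = q("div")
            child.tag = q("div")
            del child.attrib["href"]
            child.insert(0, included)
        else:
            resolve(child, docs)


# Hypothetical, heavily reduced versions of the documents.
DOCS = {
    "library.rng": """<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <include href="foreign.rng">
    <define name="foreign-attributes"><empty/></define>
  </include>
  <define name="character-element">
    <externalRef href="character-element.rng"/>
  </define>
</grammar>""",
    "foreign.rng": """<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <define name="foreign-attributes"><text/></define>
  <define name="foreign-nodes"><ref name="foreign-attributes"/></define>
</grammar>""",
    "character-element.rng": """<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start><element name="character"><text/></element></start>
</grammar>""",
}

schema = load("library.rng", DOCS)
print(ET.tostring(schema, encoding="unicode"))
```

After the call, the tree no longer contains any include or externalRef element, which is what makes the schema standalone.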

In the compact syntax, the schema after the second sequence of steps looks like:

 namespace a = ""
 namespace hr = ""
 namespace local = ""
 default namespace ns1 = ""
 namespace sn = ""
 start =
   element library { book-element+ }
   | book-element
   | author-element
   | character-element
 div {
   foreign-elements = element * - (local:* | ns1:* | hr:*) { anything }*
   foreign-attributes = attribute * - (local:* | ns1:* | hr:*) { text }*
   anything =
     (element * { anything }
      | attribute * { text }
      | text)*
   foreign-nodes = (foreign-attributes | foreign-elements)*
 author-element =
   element hr:author {
     attribute id {
       xsd:NMTOKEN { maxLength = " 16 " }
 div {
   book-content =
     attribute id { text },
     attribute available { available-content },
 book-content &= foreign-nodes
 book-element = element book { book-content }
 born-element = element hr:born { xsd:date }
 character-element =
   grammar {
     start =
       element character {
         attribute id { text },
         parent name-element,
         parent born-element,
         parent qualification-element
 died-element = element hr:died { xsd:date }
 isbn-element = element isbn { foreign-attributes, token }
 name-element = element hr:name { xsd:token }
 qualification-element = element qualification { text }
 title-element = element title { foreign-attributes, text }
 available-content = "true" | xsd:token " false " | " "

This third sequence of steps performs the normalization of name classes:

After this third sequence of steps, name classes are almost normalized (the except and choice name classes will be normalized in the fourth sequence of steps).

During this sequence of steps, the snippet:

     <element name="hr:author">
       <attribute name="id">
         <data datatypeLibrary="" type="NMTOKEN">
           <param name="maxLength"> 16 </param>
       <ref name="name-element"/>
       <ref name="born-element"/>
         <ref name="died-element"/>

is transformed into:

       <name ns="">author</name>
         <name ns="">id</name>
         <data datatypeLibrary="" type="NMTOKEN">
           <param name="maxLength"> 16 </param>
       <ref name="name-element"/>
       <ref name="born-element"/>
         <ref name="died-element"/>

Note that none of these modifications are visible in the compact syntax. The compact syntax already requires that all namespace declarations be made in the declaration section of the schema, and it makes no distinction between name elements and name attributes.
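
As an illustration of what happens to names in the XML syntax, the following Python sketch replaces name attributes by name child elements and resolves prefixes into ns attributes. It rests on a few assumptions: the prefix map is passed in explicitly (ElementTree does not keep the namespace declarations of the source document), the URIs http://example.org/hr and http://example.org/library are placeholders, and the helper names q, split_qname and expand_names are invented for the example.

```python
import xml.etree.ElementTree as ET

RNG = "http://relaxng.org/ns/structure/1.0"


def q(local):
    return "{%s}%s" % (RNG, local)


def split_qname(qname, prefixes, default_ns):
    """Return (namespace URI, local name) for a possibly prefixed name."""
    qname = qname.strip()
    if ":" in qname:
        prefix, local = qname.split(":", 1)
        return prefixes[prefix], local
    return default_ns, qname


def expand_names(elem, prefixes, ns=""):
    # The ns attribute is inherited from the nearest ancestor that has one.
    own_ns = elem.attrib.pop("ns", None)
    if own_ns is not None:
        ns = own_ns
    if elem.tag in (q("nsName"), q("value")):
        elem.set("ns", ns)
    elif elem.tag == q("name"):
        uri, local = split_qname(elem.text, prefixes, ns)
        elem.set("ns", uri)
        elem.text = local
    elif elem.tag in (q("element"), q("attribute")) and "name" in elem.attrib:
        # An unprefixed attribute name without its own ns is in no namespace.
        no_ns = elem.tag == q("attribute") and own_ns is None
        uri, local = split_qname(elem.attrib.pop("name"), prefixes,
                                 "" if no_ns else ns)
        name = ET.Element(q("name"), {"ns": uri})
        name.text = local
        elem.insert(0, name)
    for child in list(elem):
        expand_names(child, prefixes, ns)


SRC = """<element xmlns="http://relaxng.org/ns/structure/1.0"
    name="hr:author" ns="http://example.org/library">
  <attribute name="id"><text/></attribute>
  <element name="note"><text/></element>
</element>"""

root = ET.fromstring(SRC)
expand_names(root, {"hr": "http://example.org/hr"})
names = [(n.get("ns"), n.text) for n in root.iter(q("name"))]
print(names)
```

The prefixed element name ends up in the namespace bound to its prefix, the unprefixed attribute in no namespace, and the unprefixed element in the inherited one.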

In the fourth sequence of steps, patterns are normalized:

After the fourth set of steps, the number of different types of patterns has been reduced to a set of "primitive" patterns. All the patterns which are left have a fixed number of child elements.

For example, the snippet:

     <define name="foreign-elements">
               <nsName ns=""/>
               <nsName ns=""/>
               <nsName ns=""/>
           <ref name="anything"/>

is transformed into:

  <define name="foreign-elements">
                  <nsName ns=""/>
                  <nsName ns=""/>
                <nsName ns=""/>
          <ref name="anything"/>

After our fourth set of steps, our schema becomes:

 namespace a = ""
 namespace hr = ""
 namespace local = ""
 default namespace ns1 = ""
 namespace sn = ""
 start =
   ((element library { book-element+ }
     | book-element)
    | author-element)
   | character-element
 foreign-elements =
   element * - ((local:* | ns1:*) | hr:*) { anything }+
   | empty
 foreign-attributes =
   attribute * - ((local:* | ns1:*) | hr:*) { text }+
   | empty
 anything =
   ((element * { anything }
     | attribute * { text })
    | text)+
   | empty
 foreign-nodes = (foreign-attributes | foreign-elements)+ | empty
 author-element =
   element hr:author {
     ((attribute id {
         xsd:NMTOKEN { maxLength = " 16 " }
     (died-element | empty)
 book-content =
   ((((attribute id { text },
       attribute available { available-content }),
    (author-element+ | empty)),
   (character-element+ | empty)
 book-content &= foreign-nodes
 book-element = element book { book-content }
 born-element = element hr:born { xsd:date }
 character-element =
   grammar {
     start =
       element character {
         ((attribute id { text },
           parent name-element),
          parent born-element),
         parent qualification-element
 died-element = element hr:died { xsd:date }
 isbn-element = element isbn { foreign-attributes, token }
 name-element = element hr:name { xsd:token }
 qualification-element = element qualification { text }
 title-element = element title { foreign-attributes, text }
 available-content = ("true" | xsd:token " false ") | " "

It is much more verbose, but has a simpler structure.
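
The rewriting visible in this listing can be sketched in Python on a toy representation of patterns. The representation is an assumption made for the example: a pattern is either a string (a leaf such as "text" or "empty", or the name of a referenced pattern) or a tuple whose first item is the pattern type. The function names binarize and simplify are made up as well. The sketch covers only the rules illustrated above: choice, group and interleave are nested pairwise, optional and zeroOrMore are rewritten with choice and empty, and mixed is rewritten as an interleave with text.

```python
def binarize(tag, children):
    """Nest a list of patterns pairwise, left to right."""
    result = children[0]
    for child in children[1:]:
        result = (tag, result, child)
    return result


def simplify(p):
    """Rewrite a pattern using only 'primitive' patterns (sketch)."""
    if isinstance(p, str):
        return p
    tag = p[0]
    kids = [simplify(k) for k in p[1:]]
    if tag in ("choice", "group", "interleave"):
        return binarize(tag, kids)
    # Several children are implicitly a group.
    body = binarize("group", kids)
    if tag == "optional":
        return ("choice", body, "empty")
    if tag == "zeroOrMore":
        return ("choice", ("oneOrMore", body), "empty")
    if tag == "oneOrMore":
        return ("oneOrMore", body)
    if tag == "mixed":
        return ("interleave", body, "text")
    raise ValueError("pattern type not covered by this sketch: %s" % tag)


start = ("choice", "library", "book-element", "author-element",
         "character-element")
print(simplify(start))
print(simplify(("zeroOrMore", "anything")))
print(simplify(("optional", "died-element")))
```

The first call yields the left-nested choice seen in the start pattern above, and the other two yield the "+ | empty" and "| empty" forms.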

In the fifth sequence of steps, define and start elements are combined in each grammar, and all the grammars are then merged into one top-level grammar.

During this fifth sequence of steps, the simplified schema:

  <define name="born-element">
      <name ns="">born</name>
      <data datatypeLibrary="" type="date"/>
  <define name="character-element">
          <name ns="">character</name>
                  <name ns="">id</name>
                <parentRef name="name-element"/>
              <parentRef name="born-element"/>
            <parentRef name="qualification-element"/>

is transformed into:


  <define name="born-element-id2613943">
      <name ns="">born</name>
      <data datatypeLibrary="" type="date"/>
  <define name="character-element-id2613924">
      <name ns="">character</name>
              <name ns="">id</name>
            <ref name="name-element-id2613832"/>
          <ref name="born-element-id2613943"/>
        <ref name="qualification-element-id2613840"/>

No specific algorithm to create unique names for named patterns is described in the specification, so these names will vary between implementations.
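
Since the naming scheme is left open, here is one possible way to do the merge, sketched in Python. Everything in it is an assumption made for illustration: grammars are plain dictionaries with a "start" pattern and a "defines" mapping, patterns are nested tuples, an embedded grammar appears as ("grammar", {...}), and unique names are built with a simple counter (the names shown in this chapter were generated differently). Combining several definitions of the same name is left out.

```python
import itertools


def merge(grammar, counter=None, parent_names=None, out=None):
    """Flatten nested grammars into one set of uniquely named defines."""
    if counter is None:
        counter = itertools.count(1)
    if out is None:
        out = {}
    # Give every definition of this grammar a new, unique name.
    names = {n: "%s-id%d" % (n, next(counter)) for n in grammar["defines"]}

    def rewrite(p):
        if isinstance(p, str):
            return p
        tag = p[0]
        if tag == "ref":
            return ("ref", names[p[1]])
        if tag == "parentRef":
            # parentRef becomes a plain ref to the parent's definition.
            return ("ref", parent_names[p[1]])
        if tag == "grammar":
            # An embedded grammar is replaced by its start pattern.
            inner_start, _ = merge(p[1], counter, names, out)
            return inner_start
        return (tag,) + tuple(rewrite(k) for k in p[1:])

    for n, body in grammar["defines"].items():
        out[names[n]] = rewrite(body)
    return rewrite(grammar["start"]), out


inner = {
    "start": ("element", "character", ("parentRef", "name-element")),
    "defines": {},
}
outer = {
    "start": ("ref", "character-element"),
    "defines": {
        "name-element": ("element", "name", "text"),
        "character-element": ("grammar", inner),
    },
}

start, defines = merge(outer)
print(start)
for name, body in defines.items():
    print(name, "=", body)
```

As in the snippet above, the embedded grammar disappears and its parentRef turns into an ordinary ref to a renamed pattern.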

To demonstrate the drastic change which occurs during simplification, we will examine a schema which is a consolidation of features seen throughout this book to cover most of the elements affected by the simplification. It is composed of four documents.

The first, library.rnc (which will be library.rng in the XML syntax), defines the library in general, but not authors or characters:

 namespace a = ""
 namespace hr = ""
 namespace local = ""
 default namespace ns1 = ""
 namespace sn = ""
 a:documentation [ "RELAX NG schema for our library" ]
 sn:stages [[
   sn:stage [ name = "library" ]
   sn:stage [ name = "book" ]
   sn:stage [ name = "author" ]
   sn:stage [ name = "character" ]
   sn:stage [ name = "author-or-book" ]
 start =
   [ sn:stages = "library" ] element library { book-element+ }
   | [ sn:stages = "book author-or-book" ] book-element
   | [ sn:stages = "author author-or-book" ] author-element
   | [ sn:stages = "character" ] character-element
 include "foreign.rnc" {
   foreign-elements = element * - (local:* | ns1:* | hr:*) { anything }*
   foreign-attributes = attribute * - (local:* | ns1:* | hr:*) { text }*
 author-element =
   element hr:author {
     attribute id {
       xsd:NMTOKEN { maxLength = " 16 " }
 include "book-content.rnc"
 book-content &= foreign-nodes
 book-element = element book { book-content }
 born-element = element hr:born { xsd:date }
 character-element = external "character-element.rnc"
 died-element = element hr:died { xsd:date }
 isbn-element = element isbn { foreign-attributes, token }
 name-element = element hr:name { xsd:token }
 qualification-element = element qualification { text }
 title-element = element title { foreign-attributes, text }
 available-content = "true" | xsd:token " false " | " "

The second, book-content.rnc (or book-content.rng in the XML syntax), contains a pattern defining the contents of books:

 book-content =
   attribute id { text },
   attribute available { available-content },

The third, character-element.rnc (or character-element.rng in the XML syntax), defines character elements:

 start =
   element character {
     attribute id { text },
     parent name-element,
     parent born-element,
     parent qualification-element

The last component, foreign.rnc (or foreign.rng), provides a model for openness in the schema:

 anything =
   (element * { anything }
    | attribute * { text }
    | text)*
 foreign-elements = element * { anything }*
 foreign-attributes = attribute * { text }*
 foreign-nodes = (foreign-attributes | foreign-elements)*

The complete schema for the library after the grammar merging steps are completed is:

 namespace local = ""
 namespace ns1 = ""
 default namespace ns2 = ""
 start =
   ((element library { book-element-id2613963+ }
     | book-element-id2613963)
    | author-element-id2614058)
   | character-element-id2613924
 foreign-elements-id2614183 =
   element * - ((local:* | ns2:*) | ns1:*) { anything-id2614112 }+
   | empty
 foreign-attributes-id2614152 =
   attribute * - ((local:* | ns2:*) | ns1:*) { text }+
   | empty
 anything-id2614112 =
   ((element * { anything-id2614112 }
     | attribute * { text })
    | text)+
   | empty
 foreign-nodes-id2614043 =
   (foreign-attributes-id2614152 | foreign-elements-id2614183)+ | empty
 author-element-id2614058 =
   element ns1:author {
     ((attribute id {
         xsd:NMTOKEN { maxLength = " 16 " }
     (died-element-id2613856 | empty)
 book-content-id2614016 =
   (((((attribute id { text },
        attribute available { available-content-id2613805 }),
     (author-element-id2614058+ | empty)),
    (character-element-id2613924+ | empty))
   & foreign-nodes-id2614043
 book-element-id2613963 = element book { book-content-id2614016 }
 born-element-id2613943 = element ns1:born { xsd:date }
 character-element-id2613924 =
   element character {
     ((attribute id { text },
 died-element-id2613856 = element ns1:died { xsd:date }
 isbn-element-id2613872 =
   element isbn { foreign-attributes-id2614152, token }
 name-element-id2613832 = element ns1:name { xsd:token }
 qualification-element-id2613840 = element qualification { text }
 title-element-id2613819 =
   element title { foreign-attributes-id2614152, text }
 available-content-id2613805 = ("true" | xsd:token " false ") | " "

The basic style of the schema (Russian doll or named patterns) is still preserved by the previous steps. The goal of our sixth step, schema flattening, is to normalize the use of named patterns and make the schema similar in structure to a DTD: each element is embedded in its own named pattern, and named patterns are used for nothing other than embedding a single element.

During this step, the snippet:

            <name ns="">library</name>
              <ref name="book-element-id2613963"/>
          <ref name="book-element-id2613963"/>
        <ref name="author-element-id2614058"/>
      <ref name="character-element-id2613924"/>

is replaced by:

          <ref name="__library-elt-id2615152"/>
          <ref name="book-element-id2613963"/>
        <ref name="author-element-id2614058"/>
      <ref name="character-element-id2613924"/>


  <define name="__library-elt-id2615152">
      <name ns="">library</name>
        <ref name="book-element-id2613963"/>
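
A rough Python sketch of this flattening step follows, using the same kind of toy representation as before (nested tuples, with elements written as ("element", name, content)). The function names and the sample definitions are hypothetical, and the generated names only imitate the style of those shown here. The sketch moves every element that is not already the body of a definition into a new definition, and expands references to definitions whose body is not an element.

```python
import itertools


def flatten(start, defines):
    """Return (start, defines) where every define holds exactly one element."""
    counter = itertools.count(1)
    out = {}
    names = {}  # anonymous element pattern -> generated define name

    def walk(p):
        if isinstance(p, str):
            return p
        tag = p[0]
        if tag == "ref":
            target = defines[p[1]]
            if target[0] == "element":
                return p          # already points to an element define
            return walk(target)   # expand a non-element define in place
        if tag == "element":
            if p not in names:
                # Register the name first so recursive content terminates.
                names[p] = "__%s-elt-id%d" % (p[1].strip("*"), next(counter))
                out[names[p]] = ("element", p[1]) + tuple(
                    walk(k) for k in p[2:])
            return ("ref", names[p])
        return (tag,) + tuple(walk(k) for k in p[1:])

    new_start = walk(start)
    for name, body in defines.items():
        if body[0] == "element":
            out[name] = ("element", body[1]) + tuple(walk(k) for k in body[2:])
    return new_start, out


defines = {
    "book-element": ("element", "book", ("ref", "book-content")),
    "book-content": ("group", ("attribute", "id", "text"),
                     ("ref", "anything")),
    "anything": ("choice",
                 ("oneOrMore",
                  ("choice", ("element", "*", ("ref", "anything")), "text")),
                 "empty"),
}
start = ("choice",
         ("element", "library", ("oneOrMore", ("ref", "book-element"))),
         ("ref", "book-element"))

new_start, flat = flatten(start, defines)
print(new_start)
for name, body in flat.items():
    print(name, "=", body)
```

The book-content and anything definitions vanish from the result, while the library element and the wildcard element each get a definition of their own, the latter referring to itself.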

If we take the results of merging the four-part schema from the previous section and apply this step, we get:

 namespace local = ""
 namespace ns1 = ""
 default namespace ns2 = ""
 start =
   ((__library-elt-id2615152 | book-element-id2613963)
    | author-element-id2614058)
   | character-element-id2613924
 author-element-id2614058 =
   element ns1:author {
     ((attribute id {
         xsd:NMTOKEN { maxLength = " 16 " }
     (died-element-id2613856 | empty)
 book-element-id2613963 =
   element book {
     (((((attribute id { text },
          attribute available { ("true" | xsd:token " false ") | " " }),
       (author-element-id2614058+ | empty)),
      (character-element-id2613924+ | empty))
     & (((attribute * - ((local:* | ns2:*) | ns1:*) { text }+
          | empty)
         | (__-elt-id2615098+ | empty))+
        | empty)
 born-element-id2613943 = element ns1:born { xsd:date }
 character-element-id2613924 =
   element character {
     ((attribute id { text },
 died-element-id2613856 = element ns1:died { xsd:date }
 isbn-element-id2613872 =
   element isbn {
     (attribute * - ((local:* | ns2:*) | ns1:*) { text }+
      | empty),
 name-element-id2613832 = element ns1:name { xsd:token }
 qualification-element-id2613840 = element qualification { text }
 title-element-id2613819 =
   element title {
     (attribute * - ((local:* | ns2:*) | ns1:*) { text }+
      | empty),
 __-elt-id2615020 =
   element * {
       | attribute * { text })
      | text)+
     | empty
 __library-elt-id2615152 = element library { book-element-id2613963+ }
 __-elt-id2615098 =
   element * - ((local:* | ns2:*) | ns1:*) {
       | attribute * { text })
      | text)+
     | empty

The simplification process is almost done and just needs to do a bit of final cleanup:

After this cleanup, our schema becomes:

 namespace local = ""
 namespace ns1 = ""
 default namespace ns2 = ""
 start =
   ((__library-elt-id2615152 | book-element-id2613963)
    | author-element-id2614058)
   | character-element-id2613924
 author-element-id2614058 =
   element ns1:author {
     ((attribute id {
         xsd:NMTOKEN { maxLength = " 16 " }
     (empty | died-element-id2613856)
 book-element-id2613963 =
   element book {
     (((((attribute id { text },
          attribute available { ("true" | xsd:token " false ") | " " }),
       (empty | author-element-id2614058+)),
      (empty | character-element-id2613924+))
     & (empty
        | ((empty
            | attribute * - ((local:* | ns2:*) | ns1:*) { text }+)
           | (empty | __-elt-id2615098+))+)
 born-element-id2613943 = element ns1:born { xsd:date }
 character-element-id2613924 =
   element character {
     ((attribute id { text },
 died-element-id2613856 = element ns1:died { xsd:date }
 isbn-element-id2613872 =
   element isbn {
      | attribute * - ((local:* | ns2:*) | ns1:*) { text }+),
 name-element-id2613832 = element ns1:name { xsd:token }
 qualification-element-id2613840 = element qualification { text }
 title-element-id2613819 =
   element title {
      | attribute * - ((local:* | ns2:*) | ns1:*) { text }+),
 __-elt-id2615020 =
   element * {
     | ((__-elt-id2615020
         | attribute * { text })
        | text)+
 __library-elt-id2615152 = element library { book-element-id2613963+ }
 __-elt-id2615098 =
   element * - ((local:* | ns2:*) | ns1:*) {
     | ((__-elt-id2615020
         | attribute * { text })
        | text)+
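
The most visible effect of this cleanup, the swap from "(x | empty)" to "(empty | x)", can be reproduced with a short Python sketch. It is based on my reading of the rules for notAllowed and empty in the specification and uses the same hypothetical tuple representation as the earlier sketches, with binary choice, group and interleave patterns. The function name cleanup is made up, and the removal of unreachable definitions is not covered.

```python
def cleanup(p):
    """Simplify notAllowed and empty patterns, bottom-up (sketch)."""
    if isinstance(p, str):
        return p
    tag = p[0]
    kids = [cleanup(k) for k in p[1:]]
    # notAllowed propagates upward through these patterns.
    if tag in ("group", "interleave", "oneOrMore", "attribute", "list"):
        if "notAllowed" in kids:
            return "notAllowed"
    if tag == "choice":
        # A notAllowed alternative is dropped.
        if kids[0] == "notAllowed":
            return kids[1]
        if kids[1] == "notAllowed":
            return kids[0]
        if kids == ["empty", "empty"]:
            return "empty"
        # empty, if present, is moved to the first position.
        if kids[1] == "empty":
            kids.reverse()
    if tag in ("group", "interleave"):
        # empty adds nothing to a group or an interleave.
        if kids[0] == "empty":
            return kids[1]
        if kids[1] == "empty":
            return kids[0]
    if tag == "oneOrMore" and kids == ["empty"]:
        return "empty"
    return (tag,) + tuple(kids)


print(cleanup(("choice", "died-element", "empty")))
print(cleanup(("group", "empty", ("ref", "name-element"))))
print(cleanup(("choice", ("oneOrMore", "notAllowed"), "text")))
```

Because children are cleaned before their parent, a single pass over the tree is enough for these rules.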

All text is copyright Eric van der Vlist, Dyomedea. During development, I give permission for non-commercial copying for educational and review purposes. After publication, all text will be released under the Free Software Foundation GFDL.