RELAX NG by Eric van der Vlist is published by O'Reilly & Associates (ISBN: 0596004214)

A Real-World Example: XHTML 2.0

Let's leave our library for a while to look at XHTML. XHTML modularization breaks the monolithic XHTML 1.0 DTD into a set of independent modules, each described in its own DTD. Those modules can be combined to create as many flavors of XHTML as people may want. However, this has proven to be one of the most challenging exercises for schema languages. In its Working Drafts, the W3C HTML Working Group, the group in charge of XHTML, has published a set of RELAX NG schemas to describe XHTML 2.0. Their many interconnected modules illustrate the flexibility that RELAX NG brings to this type of exercise.

The solution chosen by XHTML 2.0 (see for more detail) is to define each module in its own schema and then include all these modules in a top-level schema (called the RELAX NG XHTML 2.0 Driver). The driver schema looks like this:

 <?xml version="1.0" encoding="UTF-8"?>
 <grammar ns=""
          xmlns="http://relaxng.org/ns/structure/1.0"
          xmlns:x="http://www.w3.org/1999/xhtml">
   <x:h1>RELAX NG schema for XHTML 2.0</x:h1>
   <x:pre>
     Copyright &#xA9;2003 W3C&#xAE; (MIT, ERCIM, Keio), All Rights Reserved.
       Editor:   Masayasu Ishikawa &lt;&gt;
       Revision: $Id: ch10.xml,v 1.7 2004/01/05 20:47:21 becki Exp $
     Permission to use, copy, modify and distribute this RELAX NG schema
     for XHTML 2.0 and its accompanying documentation for any purpose and
     without fee is hereby granted in perpetuity, provided that the above
     copyright notice and this paragraph appear in all copies. The copyright
     holders make no representation about the suitability of this RELAX NG
     schema for any purpose.
     It is provided "as is" without expressed or implied warranty.
     For details, please refer to the W3C software license at:
       <x:a href=""></x:a>
   </x:pre>
   <div>
     <x:h2>XHTML 2.0 modules</x:h2>
     <div>
       <x:h3>Attribute Collections Module</x:h3>
       <include href="xhtml-attribs-2.rng"/>
     </div>
     <div>
       <x:h3>Structure Module</x:h3>
       <include href="xhtml-struct-2.rng"/>
     </div>
     <div>
       <x:h3>Block Text Module</x:h3>
       <include href="xhtml-blktext-2.rng"/>
     </div>
     <div>
       <x:h3>Inline Text Module</x:h3>
       <include href="xhtml-inltext-2.rng"/>
     </div>
     <div>
       <x:h3>Hypertext Module</x:h3>
       <include href="xhtml-hypertext-2.rng"/>
     </div>
     <div>
       <x:h3>List Module</x:h3>
       <include href="xhtml-list-2.rng"/>
     </div>
     <div>
       <x:h3>Linking Module</x:h3>
       <include href="xhtml-link-2.rng"/>
     </div>
     <div>
       <x:h3>Metainformation Module</x:h3>
       <include href="xhtml-meta-2.rng"/>
     </div>
     <div>
       <x:h3>Object Module</x:h3>
       <include href="xhtml-object-2.rng"/>
     </div>
     <div>
       <x:h3>Scripting Module</x:h3>
       <include href="xhtml-script-2.rng"/>
     </div>
     <div>
       <x:h3>Style Attribute Module</x:h3>
       <include href="xhtml-inlstyle-2.rng"/>
     </div>
     <div>
       <x:h3>Style Sheet Module</x:h3>
       <include href="xhtml-style-2.rng"/>
     </div>
     <div>
       <x:h3>Tables Module</x:h3>
       <include href="xhtml-table-2.rng"/>
     </div>
     <div>
       <x:h3>Support Modules</x:h3>
       <div>
         <x:h4>Datatypes Module</x:h4>
         <include href="xhtml-datatypes-2.rng"/>
       </div>
       <div>
         <x:h4>Events Module</x:h4>
         <include href="xhtml-events-2.rng"/>
       </div>
       <div>
         <x:h4>Param Module</x:h4>
         <include href="xhtml-param-2.rng"/>
       </div>
       <div>
         <x:h4>Caption Module</x:h4>
         <include href="xhtml-caption-2.rng"/>
       </div>
     </div>
   </div>
   <div>
     <x:h2>XML Events module</x:h2>
     <include href="xml-events-1.rng"/>
   </div>
   <div>
     <x:h2>Ruby module</x:h2>
     <include href="full-ruby-1.rng"/>
   </div>
   <div>
     <x:h2>XForms module</x:h2>
     <x:p>To-Do: work out integration of XForms</x:p>
     <!--include href="xforms-1.rng"/-->
   </div>
 </grammar>

Don't worry for the moment about the ns attribute (Chapter 11), nor about the foreign (non-RELAX NG) namespaces and the div elements (Chapter 13). One of these modules, the Structure Module, defines the basic structure of an XHTML 2.0 document. For instance, the head element is defined as:

 <define name="head">
   <element name="head">
     <ref name="head.attlist"/>
     <ref name="head.content"/>
   </element>
 </define>
 <define name="head.attlist">
   <ref name="Common.attrib"/>
 </define>
 <define name="head.content">
   <ref name="title"/>
 </define>

or, in the compact syntax:

 head = element head { head.attlist, head.content }
 head.attlist = Common.attrib
 head.content = title

This example shows another design decision. For each element, the HTML Working Group decided to define a named pattern with the same name as the element (head) and two separate named patterns to define the list of its attributes (head.attlist) and its content (head.content). This design makes it easy for other modules to add new elements and attributes just by combining these named patterns by interleave. For instance, the Metainformation Module adds a meta element to the content of the head element by combining the head.content pattern, via interleave, with zero or more meta elements:

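 <?xml version="1.0" encoding="UTF-8"?>
 <grammar xmlns="http://relaxng.org/ns/structure/1.0"
          xmlns:x="http://www.w3.org/1999/xhtml">
   <x:h1>Metainformation Module</x:h1>
   <div>
     <x:h2>The meta element</x:h2>
     <define name="meta">
       <element name="meta">
         <ref name="meta.attlist"/>
         <choice>
           <ref name="Inline.model"/>
           <oneOrMore><ref name="meta"/></oneOrMore>
         </choice>
       </element>
     </define>
     <define name="meta.attlist">
       <ref name="Common.attrib"/>
       <optional>
         <attribute name="name">
           <ref name="NMTOKEN.datatype"/>
         </attribute>
       </optional>
     </define>
   </div>
   <define name="head.content" combine="interleave">
     <zeroOrMore><ref name="meta"/></zeroOrMore>
   </define>
 </grammar>

or, in the compact syntax:

 namespace x = "http://www.w3.org/1999/xhtml"
 meta = element meta { meta.attlist, (Inline.model | meta+) }
 meta.attlist =
   Common.attrib,
   attribute name { NMTOKEN.datatype }?
 head.content &= meta*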
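
Attribute lists can be extended in the same way. The following stand-alone miniature is only a sketch of the idea. The profile attribute, the datatypes, and the simplified head and title definitions are assumptions made for illustration and are not taken from the XHTML 2.0 modules:

```rnc
# Miniature of the naming convention (illustrative, not the W3C schema)
start = head
head = element head { head.attlist, head.content }
head.attlist = attribute id { xsd:ID }?
head.content = element title { text }

# A hypothetical module contributes one more optional attribute
head.attlist &= attribute profile { xsd:anyURI }?
```

Because the contribution is combined by interleave, the module that adds profile does not need to know which attributes head.attlist already contains.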

The fact that the content models are combined by interleave guarantees independence between modules: you can add or remove modules independently of each other. It also guarantees that the resulting schema does not depend on the order in which the different modules are included in the top-level schema; you can switch the Metainformation Module and the Scripting Module, which both add content into the head element, without any impact on the set of valid documents.
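
A reduced grammar makes this order independence easier to see. This is a sketch under simplifying assumptions: the element definitions below (in particular the content of meta and script) are placeholders, not the definitions used by the actual XHTML 2.0 modules:

```rnc
# Miniature head content model (illustrative, not the W3C schema)
start = head
head = element head { head.content }
head.content = title
title = element title { text }

# Contribution in the style of the Metainformation Module
head.content &= meta*
meta = element meta { text }

# Contribution in the style of a scripting module (content model assumed)
head.content &= script*
script = element script { text }

# Effective pattern, whichever contribution comes first:
#   head.content = title & meta* & script*
```

Interleave imposes no order on its operands, so swapping the two contributions yields the same pattern and therefore the same set of valid documents.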

This modularity relies entirely on combination by interleave, and RELAX NG would have no easy solution if you wanted to add content at a given position, for instance, after what has already been defined in the head element. Of course, if you're interested only in the Structure Module and want to add a foo element after the title element, you can redefine head.content:

 <include href="xhtml-struct-2.rng">
   <define name="head.content">
     <ref name="title"/>
     <element name="foo">
       <empty/> <!-- placeholder content model (assumed) -->
     </element>
   </define>
 </include>

However, this doesn't take into account all the content added by the other modules into the head element.
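
One way to read this limitation is to look at the pattern that results once another module contributes to the redefined head.content. The miniature below is a sketch only. The element definitions are simplified assumptions, and the redefinition is written inline rather than through an include:

```rnc
# Miniature head content model (illustrative, not the W3C schema)
start = head
head = element head { head.content }

# Redefined content: foo must follow title
head.content = title, foo
title = element title { text }
foo = element foo { empty }

# Another module still combines its own content by interleave
head.content &= meta*
meta = element meta { text }

# Effective pattern:
#   head.content = (title, foo) & meta*
# foo is ordered relative to title only; meta elements may appear
# before title, between title and foo, or after foo.
```

The redefinition fixes the position of foo relative to title, but it says nothing about where foo stands relative to the elements contributed by other modules.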

This text is released under the Free Software Foundation GFDL.