RELAX NG by Eric van der Vlist is published by O'Reilly & Associates (ISBN: 0596004214)


Examplotron: Instance Documents as Schemas

I created Examplotron from a very simple idea: when you want to describe the element foo, why work in yet another language, writing:

<element name='foo'><empty/></element>

or:

element foo {empty}

It's so much simpler to just write the element in plain XML: <foo/>. Instead of describing instance documents, why couldn't you just show them?

The first implementation, published with the original description of Examplotron, relied on two XSLT transformations: the Examplotron "schema" was compiled by an XSLT transformation into a second XSLT transformation, which then validated the instance documents. The concept received many positive comments when I announced it, but it was very limited, and adding new features would have meant defining the full semantics of a new schema language. The XSLT implementation became very complex, and the project stalled until I realized the potential of using RELAX NG as a target format instead.

Since the release of version 0.5, Examplotron has been implemented as an XSLT transformation that creates a RELAX NG schema. Thanks to this approach, Examplotron made more progress in two weeks than in two years under the previous architecture!

Tip

For more information on Examplotron, and to get the tools used for the transformations in this section, visit http://examplotron.org.

Ten-Minute Guide to Examplotron

Here's a snippet of our example document:

 <?xml version="1.0" encoding="utf-8"?>
 <character id="Snoopy">
   <name>Snoopy</name>
   <born>1950-10-04</born>
   <qualification>extroverted beagle</qualification>
 </character>

Without requiring any further work, this document is already an Examplotron schema. To get an idea of what this schema means, we can translate it into a RELAX NG schema:

 <?xml version="1.0" encoding="UTF-8"?>
 <grammar xmlns="http://relaxng.org/ns/structure/1.0"
  xmlns:ega="http://examplotron.org/annotations/"
  xmlns:sch="http://www.ascc.net/xml/schematron"
  datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
    <start>
       <element name="character">
          <optional>
             <attribute name="id">
                <ega:example id="Snoopy"/>
             </attribute>
          </optional>
          <element name="name">
             <text>
                <ega:example>Snoopy</ega:example>
             </text>
          </element>
          <element name="born">
             <data type="date">
                <ega:example>1950-10-04</ega:example>
             </data>
          </element>
          <element name="qualification">
             <text>
                <ega:example>extroverted beagle</ega:example>
             </text>
          </element>
       </element>
    </start>
 </grammar>

or:

 namespace ega = "http://examplotron.org/annotations/"
 namespace sch = "http://www.ascc.net/xml/schematron"
      
 start =
   element character {
     [ ega:example [ id = "Snoopy" ] ] attribute id { text }?,
     element name { [ ega:example [ "Snoopy" ] ] text },
     element born { [ ega:example [ "1950-10-04" ] ] xsd:date },
     element qualification {
       [ ega:example [ "extroverted beagle" ] ] text
     }
   }

You can see that the Examplotron schema has the same modeling power as its RELAX NG counterpart. The annotations that appear here need to be added to the RELAX NG schema if we don't want to lose the "examples" provided in Examplotron. The examples are included because they are useful for documentation purposes and to permit reverse transformations (from RELAX NG to Examplotron).

Another thing to note in this example is that Examplotron makes inferences from what it finds in the schema. Here, Examplotron assumed that the order of name, born, and qualification is significant; that these elements are mandatory; that the id attribute is optional; that the born element has a type (xsd:date); and that all the other elements and attributes are just text. These assumptions are a best effort to capture the likely intention of the document's designer. Most of the time, people won't have to do anything to tweak their Examplotron schema.
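To see what these inferred rules mean in practice, here is a small sketch of an instance document. It is a hypothetical illustration, not part of the original example set; it reuses the Lucy data that appears later in this section. Checked against the RELAX NG schema generated above, it should be accepted even though it carries no id attribute, because that attribute was inferred as optional:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical instance: id is omitted, which the inferred schema allows. -->
<character>
  <name>Lucy</name>
  <born>1952-03-03</born>
  <qualification>bossy, crabby and selfish</qualification>
</character>
```

By contrast, the same schema should reject a document that places born before name, that drops the qualification element, or that puts a value such as "October 1950" in born, since that value is not a valid xsd:date.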

There are times, however, when Examplotron gets it wrong. However good Examplotron may be, it can't be psychic: if you want a schema that differs from Examplotron's default inferences, you need to say so explicitly by annotating the Examplotron schema. To make the qualification element optional, for example, add an eg:occurs attribute with a value of ?. To give the id attribute the type dtd:ID, set its value to {dtd:id}:

 <?xml version="1.0" encoding="utf-8"?>
 <character id="{dtd:id}" xmlns:eg="http://examplotron.org/0/">
   <name>Snoopy</name>
   <born>1950-10-04</born>
   <qualification eg:occurs="?">extroverted beagle</qualification>
 </character>

Here's the example translated into RELAX NG:

 <?xml version="1.0" encoding="UTF-8"?>
 <grammar xmlns="http://relaxng.org/ns/structure/1.0"
  xmlns:ega="http://examplotron.org/annotations/"
  xmlns:sch="http://www.ascc.net/xml/schematron"
  datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
    <start>
       <element name="character">
          <optional>
             <attribute name="id">
                <data type="id" 
                datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0"/>
             </attribute>
          </optional>
          <element name="name">
             <text>
                <ega:example>Snoopy</ega:example>
             </text>
          </element>
          <element name="born">
             <data type="date">
                <ega:example>1950-10-04</ega:example>
             </data>
          </element>
          <optional>
             <element name="qualification">
                <text>
                   <ega:example>extroverted beagle</ega:example>
                </text>
             </element>
          </optional>
       </element>
    </start>
 </grammar>

or:

 namespace ega = "http://examplotron.org/annotations/"
 namespace sch = "http://www.ascc.net/xml/schematron"
      
 datatypes d = "http://relaxng.org/ns/compatibility/datatypes/1.0"
      
 start =
   element character {
     attribute id { d:id }?,
     element name { [ ega:example [ "Snoopy" ] ] text },
     element born { [ ega:example [ "1950-10-04" ] ] xsd:date },
     element qualification {
       [ ega:example [ "extroverted beagle" ] ] text
     }?
   }

If you compare the compact syntax and the Examplotron schema, you will see that we have something that is similarly concise. The compact syntax looks more formal, while Examplotron is easier to explore at a glance. Nevertheless, according to the rules described in the documentation of Examplotron, these two schemas are equivalent. This equivalence makes it possible to transform Examplotron to RELAX NG and back.
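The effect of the two annotations can be illustrated with an instance document. The one below is a hypothetical sketch, built from the Schroeder data used elsewhere in this section. It omits the qualification element, so the annotated schema should accept it, while the unannotated schema from the start of this guide, in which qualification is mandatory, should reject it:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical instance: qualification is left out, which eg:occurs="?" permits. -->
<character id="Schroeder">
  <name>Schroeder</name>
  <born>1951-05-30</born>
</character>
```

Because id is now typed as an ID rather than plain text, its value must also be a valid XML name, and a validator that implements the RELAX NG DTD compatibility checks would be expected to require that value to be unique within the document.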

We can go pretty far with these annotations, as shown in this more complete example, which uses interleave, mandatory attributes, and complex elements defined as named patterns:

 <?xml version="1.0" encoding="utf-8"?>
 <library xmlns:eg="http://examplotron.org/0/" 
         eg:content="eg:interleave" eg:define="library-content">
   <book available="true" eg:occurs="*" eg:define="book-content">
     <eg:attribute name="id" eg:content="dtd:id">b0836217462</eg:attribute>
     <isbn>0836217462</isbn>
     <title xml:lang="en">Being a Dog Is a Full-Time Job</title>
     <author eg:occurs="+" eg:define="author-content" eg:content="eg:interleave">
       <eg:attribute name="id" eg:content="dtd:id">CMS</eg:attribute>
       <name>Charles M Schulz</name>
       <born>1922-11-26</born>
       <died>2000-02-12</died>
     </author>
     <character eg:define="character-content" eg:content="eg:interleave">
       <eg:attribute name="id" eg:content="dtd:id">PP</eg:attribute>
       <name>Peppermint Patty</name>
       <born>1966-08-22</born>
       <qualification>bold, brash and tomboyish</qualification>
     </character>
     <character id="Snoopy">
       <name>Snoopy</name>
       <born>1950-10-04</born>
       <qualification>extroverted beagle</qualification>
     </character>
     <character id="Schroeder">
       <name>Schroeder</name>
       <born>1951-05-30</born>
       <qualification>brought classical music to the Peanuts strip</qualification>
     </character>
     <character id="Lucy">
       <name>Lucy</name>
       <born>1952-03-03</born>
       <qualification>bossy, crabby and selfish</qualification>
     </character>
   </book>
 </library>

The RELAX NG schema generated from this Examplotron schema is:

 <?xml version="1.0" encoding="UTF-8"?>
 <grammar xmlns="http://relaxng.org/ns/structure/1.0" 
          xmlns:ega="http://examplotron.org/annotations/"
          xmlns:sch="http://www.ascc.net/xml/schematron" 
          datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
    <start>
       <element name="library">
          <ref name="library-content" ega:def="true"/>
       </element>
    </start>
    <define name="library-content">
       <interleave>
          <zeroOrMore>
             <element name="book">
                <ref name="book-content" ega:def="true"/>
             </element>
          </zeroOrMore>
       </interleave>
    </define>
    <define name="book-content">
       <optional>
          <attribute name="available">
             <data type="boolean">
                <ega:example available="true"/>
             </data>
          </attribute>
       </optional>
       <attribute name="id">
          <ega:skipped>b0836217462</ega:skipped>
          <data type="id" 
            datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0"/>
       </attribute>
       <element name="isbn">
          <data type="integer">
             <ega:example>0836217462</ega:example>
          </data>
       </element>
       <element name="title">
          <optional>
             <attribute name="lang" ns="http://www.w3.org/XML/1998/namespace">
                <ega:example xml:lang="en"/>
             </attribute>
          </optional>
          <text>
             <ega:example>Being a Dog Is a Full-Time Job</ega:example>
          </text>
       </element>
       <oneOrMore>
          <element name="author">
             <ref name="author-content" ega:def="true"/>
          </element>
       </oneOrMore>
       <oneOrMore>
          <element name="character">
             <ref name="character-content" ega:def="true"/>
          </element>
       </oneOrMore>
       <ega:skipped>
          <character xmlns="" xmlns:eg="http://examplotron.org/0/" id="Snoopy">
             <name>Snoopy</name>
             <born>1950-10-04</born>
             <qualification>extroverted beagle</qualification>
          </character>
       </ega:skipped>
       <ega:skipped>
          <character xmlns="" xmlns:eg="http://examplotron.org/0/" id="Schroeder">
             <name>Schroeder</name>
             <born>1951-05-30</born>
             <qualification>brought classical music to the Peanuts strip
             </qualification>
          </character>
       </ega:skipped>
       <ega:skipped>
          <character xmlns="" xmlns:eg="http://examplotron.org/0/" id="Lucy">
             <name>Lucy</name>
             <born>1952-03-03</born>
             <qualification>bossy, crabby and selfish</qualification>
          </character>
       </ega:skipped>
    </define>
    <define name="author-content">
       <interleave>
          <attribute name="id">
             <ega:skipped>CMS</ega:skipped>
             <data type="id" 
               datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0"/>
          </attribute>
          <element name="name">
             <text>
                <ega:example>Charles M Schulz</ega:example>
             </text>
          </element>
          <element name="born">
             <data type="date">
                <ega:example>1922-11-26</ega:example>
             </data>
          </element>
          <element name="died">
             <data type="date">
                <ega:example>2000-02-12</ega:example>
             </data>
          </element>
       </interleave>
    </define>
    <define name="character-content">
       <interleave>
          <attribute name="id">
             <ega:skipped>PP</ega:skipped>
             <data type="id" 
               datatypeLibrary="http://relaxng.org/ns/compatibility/datatypes/1.0"/>
          </attribute>
          <element name="name">
             <text>
                <ega:example>Peppermint Patty</ega:example>
             </text>
          </element>
          <element name="born">
             <data type="date">
                <ega:example>1966-08-22</ega:example>
             </data>
          </element>
          <element name="qualification">
             <text>
                <ega:example>bold, brash and tomboyish</ega:example>
             </text>
          </element>
       </interleave>
    </define>
 </grammar>

or, in the compact syntax, and skipping some annotations for readability:

 namespace eg = "http://examplotron.org/0/"
 namespace ega = "http://examplotron.org/annotations/"
 namespace sch = "http://www.ascc.net/xml/schematron"
      
 datatypes d = "http://relaxng.org/ns/compatibility/datatypes/1.0"
      
 start = element library { [ ega:def = "true" ] library-content }
 library-content = element book { [ ega:def = "true" ] book-content }*
 book-content =
   attribute available {
     [ ega:example [ available = "true" ] ] xsd:boolean
   }?,
   [ ega:skipped [ "b0836217462" ] ] attribute id { d:id },
   element isbn { [ ega:example [ "0836217462" ] ] xsd:integer },
   element title {
     [ ega:example [ xml:lang = "en" ] ] attribute lang { text }?,
     [ ega:example [ "Being a Dog Is a Full-Time Job" ] ] text
   },
   element author { [ ega:def = "true" ] author-content }+,
   (element character { [ ega:def = "true" ] character-content }+)
 author-content =
   [ ega:skipped [ "CMS" ] ] attribute id { d:id }
   & element name { [ ega:example [ "Charles M Schulz" ] ] text }
   & element born { [ ega:example [ "1922-11-26" ] ] xsd:date }
   & element died { [ ega:example [ "2000-02-12" ] ] xsd:date }
 character-content =
   [ ega:skipped [ "PP" ] ] attribute id { d:id }
   & element name { [ ega:example [ "Peppermint Patty" ] ] text }
   & element born { [ ega:example [ "1966-08-22" ] ] xsd:date }
   & element qualification {
       [ ega:example [ "bold, brash and tomboyish" ] ] text
     }
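As a rough check on how to read this schema, the following hypothetical instance document (not taken from the original text) should be valid against it. The children of author and character appear in a different order from the Examplotron schema, which the interleave patterns allow, and the optional available and xml:lang attributes are omitted:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical instance for the library schema above. -->
<library>
  <book id="b0836217462">
    <isbn>0836217462</isbn>
    <title>Being a Dog Is a Full-Time Job</title>
    <!-- author-content is an interleave: born may come before name. -->
    <author id="CMS">
      <born>1922-11-26</born>
      <name>Charles M Schulz</name>
      <died>2000-02-12</died>
    </author>
    <!-- character-content is an interleave as well. -->
    <character id="PP">
      <qualification>bold, brash and tomboyish</qualification>
      <name>Peppermint Patty</name>
      <born>1966-08-22</born>
    </character>
  </book>
</library>
```

The book element itself carries no eg:content="eg:interleave" annotation, so its children (isbn, title, author, character) must still appear in the order shown. The id attributes are mandatory here because they were declared with eg:attribute, so leaving one out should make the document invalid.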

For those who would like even more flexibility, the next version of Examplotron will "import" all the RELAX NG patterns in the Examplotron namespace, so that Examplotron schemas can use RELAX NG compositors, patterns, and name classes where needed.

Use Cases

Why would anyone want to use Examplotron instead of RELAX NG? I could reverse the question and ask why anyone would want to use RELAX NG instead of Examplotron. At the end of the day, it doesn't really matter. What's important is that the semantics of the validation engine are rock solid and have no limitations. Developers can use the most convenient syntax to express schemas, and what's convenient varies among developers. If you like the visual quality of Examplotron, there is no reason to use anything else. If you prefer RELAX NG's more formal style, that's fine too. With Examplotron, you are just looking at a RELAX NG schema from a different angle.


This text is released under the Free Software Foundation GFDL.