RELAX NG by Eric van der Vlist will be published by O'Reilly & Associates (ISBN: 0596004214)
You are welcome to use our annotation system to give your feedback.
RELAX NG doesn't define specific elements and attributes reserved for annotations. Instead, RELAX NG opened its language. RELAX NG permits foreign attributes - attributes from any namespace other than the RELAX NG namespace - to appear on all its elements. RELAX NG also allows elements from either no namespace or from any namespace other than the RELAX NG namespace in all its elements with a content model which is empty or element only. (That includes all RELAX NG elements except value and param, which have a text-only content model.) RELAX NG thus strictly follows the principles of an open schema presented in the last chapter.
In the XML syntax, adding annotations is both easy and flexible. It's a very straightforward process to add annotations using foreign elements. For instance, we'll add some Dublin Core (dc) elements to our grammar to identify its title and author:
<?xml version="1.0" encoding="utf-8"?> <grammar xmlns="http://relaxng.org/ns/structure/1.0" xmlns:dc="http://purl.org/dc/elements/1.1/"> <dc:title>RELAX NG flat schema for our library</dc:title> <dc:author>Eric van der Vlist</dc:author> <start> <element name="library"> <oneOrMore> <ref name="book-element"/> </oneOrMore> </element> </start> .../... </grammar> |
or perhaps we want to add some XHTML documentation:
<?xml version="1.0" encoding="utf-8"?> <grammar xmlns="http://relaxng.org/ns/structure/1.0" xmlns:xhtml="http://www.w3.org/1999/xhtml"> <xhtml:div> <xhtml:h1>RELAX NG flat schema for our library</xhtml:h1> <xhtml:p>This schema has been written by <xhtml:a href="http://dyomedea.com/vdv">Eric van der Vlist</xhtml:a>.</xhtml:p> </xhtml:div> .../... </grammar> |
or perhaps we want to use XLink through attributes:
<?xml version="1.0" encoding="utf-8"?> <grammar xmlns="http://relaxng.org/ns/structure/1.0" xmlns:xlink="http://www.w3.org/1999/xlink"> <start> <element name="library" xlink:type="simple" xlink:role="http://www.w3.org/1999/xhtml" xlink:arcrole="http://www.rddl.org/purposes#reference" xlink:href="library.xhtml"> <oneOrMore> <ref name="book-element"/> </oneOrMore> </element> </start> .../... </grammar> |
RELAX NG itself won't know what to do with this extra information - that's up to processors built specifically for handling the annotations - but it will quietly ignore all this extra information, letting you bundle whatever information you like into the schema without disrupting it.
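To give an idea of what a processor built for handling the annotations might start from, here is a minimal sketch in Python using only the standard library. It is not part of RELAX NG or of any existing tool: the names `split_name` and `collect_annotations` are hypothetical, and the sample schema is a shortened mix of the Dublin Core and XLink examples above. It assumes that anything outside the RELAX NG namespace is an annotation.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"


def split_name(name):
    """Split ElementTree's '{uri}local' notation into (uri, local)."""
    if name.startswith("{"):
        uri, local = name[1:].split("}", 1)
        return uri, local
    return "", name


def collect_annotations(root):
    """Return (foreign_elements, foreign_attributes) found in a schema tree."""
    elements, attributes = [], []

    def visit(node):
        uri, local = split_name(node.tag)
        if uri != RNG_NS:
            # Foreign element: record it and stop descending, since its
            # content belongs to the annotation and not to RELAX NG.
            elements.append((uri, local, "".join(node.itertext()).strip()))
            return
        for name, value in node.attrib.items():
            attr_uri, attr_local = split_name(name)
            # Unqualified attributes (name, type, ...) belong to RELAX NG.
            if attr_uri and attr_uri != RNG_NS:
                attributes.append((local, attr_uri, attr_local, value))
        for child in node:
            visit(child)

    visit(root)
    return elements, attributes


SCHEMA = """<grammar xmlns="http://relaxng.org/ns/structure/1.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <dc:title>RELAX NG flat schema for our library</dc:title>
  <start>
    <element name="library" xlink:type="simple" xlink:href="library.xhtml">
      <oneOrMore><ref name="book-element"/></oneOrMore>
    </element>
  </start>
</grammar>"""

elements, attributes = collect_annotations(ET.fromstring(SCHEMA))
for uri, local, text in elements:
    print("foreign element:", local, "in", uri, "->", text)
for owner, uri, local, value in attributes:
    print("foreign attribute on", owner + ":", local, "=", value)
```

The sketch only separates the annotations from the schema proper. What to do with them afterwards (generate documentation, follow the links, and so on) is up to the application.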
Annotations are much more challenging to use correctly when using the compact syntax. Because it is not XML, the compact syntax has no built-in support for this kind of extensibility; an alternative syntax based on square brackets ( [ and ]) has been developed to embed XML structures within the compact syntax. Unfortunately, the square brackets and XML aren't a delightful mix with the other punctuation used in the compact syntax. The syntax for including annotations within a schema is slightly different according to their location in the schema.
Note | |
---|---|
Annotations using the compact syntax are deceptively simple. Although they seem easy, they are a common source of errors. As a solution, consider translating between the compact and XML syntax using tools such as James Clark's Trang, available at http://www.thaiopensource.com/relaxng/trang.html. You may feel safer, and your code might actually be in safer hands, if you always convert to the XML syntax to edit your annotations. Examining Trang's results is also a good way to master the intricacies of compact syntax annotations. |
The easiest annotations to write are for foreign elements in a grammar element. These annotations are called "grammar annotations" and do the same work as the first two examples shown above with the XML syntax. First, the Dublin Core annotations look like this in RELAX NG's XML syntax:
<?xml version="1.0" encoding="utf-8"?> <grammar xmlns="http://relaxng.org/ns/structure/1.0" xmlns:dc="http://purl.org/dc/elements/1.1/"> <dc:title>RELAX NG flat schema for our library</dc:title> <dc:author>Eric van der Vlist</dc:author> <start> <element name="library"> <oneOrMore> <ref name="book-element"/> </oneOrMore> </element> </start> .../... </grammar> |
In the compact syntax, a grammar annotation is written as the namespace-qualified name of the annotation, followed by a left square bracket, its contents, and a right square bracket. The annotated schema above would be written:
    namespace dc = "http://purl.org/dc/elements/1.1/"
    dc:title [ "RELAX NG flat schema for our library" ]
    dc:author [ "Eric van der Vlist" ]
    start = element library { book-element+ }
The use of the qualified name ( dc:title or dc:author) is specific to grammar annotations while the syntax [ element content ] used to represent its content is more generic.
These annotations can have structured content with child elements and attributes. Let's reexamine our XHTML example:
<?xml version="1.0" encoding="utf-8"?> <grammar xmlns="http://relaxng.org/ns/structure/1.0" xmlns:xhtml="http://www.w3.org/1999/xhtml"> <xhtml:div> <xhtml:h1>RELAX NG flat schema for our library</xhtml:h1> <xhtml:p>This schema has been written by <xhtml:a href="http://dyomedea.com/vdv">Eric van der Vlist</xhtml:a>.</xhtml:p> </xhtml:div> .../... </grammar> |
In the compact syntax, we use an approach similar to that used for the Dublin Core example, but with more square brackets to represent nested element and attribute structures:
    namespace xhtml = "http://www.w3.org/1999/xhtml"
    xhtml:div [
      xhtml:h1 [ "RELAX NG flat schema for our library" ]
      xhtml:p [
        "This schema has been written by "
        xhtml:a [ href = "http://dyomedea.com/vdv" "Eric van der Vlist" ]
        "."
      ]
    ]
    start = element library { book-element+ }
    .../...
The syntax used for the Dublin Core example has been applied recursively here, and the href attribute has been expressed as href = "http://dyomedea.com/vdv".
These grammar annotations always represent foreign elements. Another mechanism ("initial annotations") is used to express annotations representing foreign attributes.
Initial annotations are used to define annotations (through foreign elements or attributes) that will be inserted as the first children of the next pattern. This is the option we must always use to define annotations as foreign attributes, such as those used in the XLink example:
<?xml version="1.0" encoding="utf-8"?> <grammar xmlns="http://relaxng.org/ns/structure/1.0" xmlns:xlink="http://www.w3.org/1999/xlink"> <start> <element name="library" xlink:type="simple" xlink:role="http://www.w3.org/1999/xhtml" xlink:arcrole="http://www.rddl.org/purposes#reference" xlink:href="library.xhtml"> <oneOrMore> <ref name="book-element"/> </oneOrMore> </element> </start> .../... </grammar> |
Initial annotations don't begin with a qualified name, since they apply to the declaration that follows them, not an independent element. The XLink example would therefore be written:
    namespace xlink = "http://www.w3.org/1999/xlink"
    start =
      [
        xlink:type = "simple"
        xlink:role = "http://www.w3.org/1999/xhtml"
        xlink:arcrole = "http://www.rddl.org/purposes#reference"
        xlink:href = "library.xhtml"
      ]
      element library { book-element+ }
Note how the foreign attributes have been wrapped within square brackets in the compact syntax, and that the annotations are placed before the element pattern rather than inside it. Using square brackets to wrap annotations, without a name preceding them, is what makes this an "initial annotation". Initial annotations can hold foreign attributes, foreign elements, or both. If we combined our Dublin Core example with our XLink example, we could use initial annotations. In the RELAX NG XML syntax it would look like:
    <?xml version="1.0" encoding="utf-8"?>
    <grammar xmlns="http://relaxng.org/ns/structure/1.0"
             xmlns:xlink="http://www.w3.org/1999/xlink"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <start>
        <element name="library"
                 xlink:type="simple"
                 xlink:role="http://www.w3.org/1999/xhtml"
                 xlink:arcrole="http://www.rddl.org/purposes#reference"
                 xlink:href="library.xhtml">
          <dc:title>The library element</dc:title>
          <dc:author>Eric van der Vlist</dc:author>
          <oneOrMore>
            <ref name="book-element"/>
          </oneOrMore>
        </element>
      </start>
      .../...
    </grammar>
In the compact syntax, this would be written:
    namespace xlink = "http://www.w3.org/1999/xlink"
    namespace dc = "http://purl.org/dc/elements/1.1/"
    start =
      [
        xlink:type = "simple"
        xlink:role = "http://www.w3.org/1999/xhtml"
        xlink:arcrole = "http://www.rddl.org/purposes#reference"
        xlink:href = "library.xhtml"
        dc:title [ "The library element" ]
        dc:author [ "Eric van der Vlist" ]
      ]
      element library { book-element+ }
Again, note how the annotations precede the element pattern, even though they appear as foreign attributes and first child elements of that element in the XML syntax. This rule also applies to annotations for foreign attributes of the grammar pattern, such as:
<?xml version="1.0" encoding="utf-8"?> <grammar xmlns="http://relaxng.org/ns/structure/1.0" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" xlink:role="http://www.w3.org/1999/xhtml" xlink:arcrole="http://www.rddl.org/purposes#reference" xlink:href="grammar.xhtml"> .../... </grammar> |
In this case, to be able to define the annotations before the grammar pattern, we need to write the grammar pattern explicitly, something which is usually not necessary with the compact syntax:
    namespace xlink = "http://www.w3.org/1999/xlink"
    [
      xlink:type = "simple"
      xlink:role = "http://www.w3.org/1999/xhtml"
      xlink:arcrole = "http://www.rddl.org/purposes#reference"
      xlink:href = "grammar.xhtml"
    ]
    grammar {
      .../...
    }
How do we define annotations which are neither initial nor grammar annotations? Here's an example. Note that the XHTML element is in the middle of the declaration:
<define name="author-element"> <element name="author"> <attribute name="id"/> <ref name="name-element"/> <ref name="born-element"/> <xhtml:p>After this point, everything is optional.</xhtml:p> <optional> <ref name="died-element"/> </optional> </element> </define> |
We define annotations that aren't initial or grammar by using a third syntax reserved for following annotations. To make the example above work, we would write:
    author-element =
      element author {
        attribute id { text },
        name-element,
        born-element
        >> xhtml:p [ "After this point, everything is optional." ],
        died-element?
      }
Note the new syntax: >> xhtml:p [ "After this point, everything is optional." ]. The leading >> signals a following annotation. A following annotation is inserted, where it appears, as a following sibling of the element that represents the annotated pattern in the XML syntax.
Let's have a look at the following perverse schema snippet where annotations have been added in nearly every location where there was room for them:
    <?xml version="1.0" encoding="utf-8"?>
    <grammar xmlns="http://relaxng.org/ns/structure/1.0"
             xmlns:ann="http://dyomedea.com/examples/ns/annotations"
             ann:attribute="Annotation as foreign attribute for 'grammar'">
      <ann:element>Initial annotation as foreign element for "grammar"</ann:element>
      <start ann:attribute="Annotation as a foreign attribute for 'start'">
        <ann:element>Initial annotation as foreign element for "start"</ann:element>
        <element name="library" ann:attribute="Annotation as a foreign attribute for 'element'">
          <ann:element>Initial annotation as foreign element for "element"</ann:element>
          <oneOrMore ann:attribute="Annotation as a foreign attribute for 'oneOrMore'">
            <ann:element>Initial annotation as foreign element for "oneOrMore"</ann:element>
            <ref name="book-element" ann:attribute="Annotation as a foreign attribute for 'ref'">
              <ann:element>Initial annotation as foreign element for "ref"</ann:element>
            </ref>
            <ann:element>Following annotation as foreign element for "oneOrMore"</ann:element>
          </oneOrMore>
          <ann:element>Following annotation as foreign element for "element"</ann:element>
        </element>
        <ann:element>Following annotation as foreign element for "start"</ann:element>
      </start>
      <ann:element>Grammar annotation as foreign element for "grammar"</ann:element>
      .../...
    </grammar>
The compact syntax would be:
    namespace ann = "http://dyomedea.com/examples/ns/annotations"
    [
      ann:attribute = 'Annotation as foreign attribute for "grammar"'
      ann:element [ 'Initial annotation as foreign element for "grammar"' ]
    ]
    grammar {
      [
        ann:attribute = "Annotation as a foreign attribute for 'start'"
        ann:element [ 'Initial annotation as foreign element for "start"' ]
      ]
      start =
        [
          ann:attribute = "Annotation as a foreign attribute for 'element'"
          ann:element [ 'Initial annotation as foreign element for "element"' ]
        ]
        element library {
          [
            ann:attribute = "Annotation as a foreign attribute for 'oneOrMore'"
            ann:element [ 'Initial annotation as foreign element for "oneOrMore"' ]
          ]
          (
            [
              ann:attribute = "Annotation as a foreign attribute for 'ref'"
              ann:element [ 'Initial annotation as foreign element for "ref"' ]
            ]
            book-element
            >> ann:element [ 'Following annotation as foreign element for "oneOrMore"' ]+
          )
          >> ann:element [ 'Following annotation as foreign element for "element"' ]
        }
        >> ann:element [ 'Following annotation as foreign element for "start"' ]
      ann:element [ 'Grammar annotation as foreign element for "grammar"' ]
      .../...
    }
Although the compact syntax is strictly equivalent to the XML syntax, it's difficult to read, and it's tough to say where each of these annotations belongs. I hope that this example has been compelling enough (and, for once, confusing enough) to convince you that even though application-specific syntaxes may be defined which are more concise and easier to read than XML, when there is a need for extensibility and interoperability, XML is a clear winner.
A riddle before we move on... What does this annotation mean?
    element born {
      xsd:date {
        [ xhtml:p [ "Add new parameters here to define a range." ] ]
        pattern = "[0-9]{4}-[0-9]{2}-[0-9]{2}"
      }
    }
It can't be an initial annotation on the pattern parameter, since parameters have a text-only content model and can't accept foreign elements. RELAX NG instead treats it as an annotation that follows the parameter inside the data pattern of the born element. This makes the compact syntax riddle equivalent to:
<element name="born"> <data type="date"> <param name="pattern">[0-9]{4}-[0-9]{2}-[0-9]{2}</param> <xhtml:p>Add new parameters here to define a range.</xhtml:p> </data> </element> |
Note that this same issue also arises with the value pattern. With both value and param, the normal syntax using a following annotation cannot be used in the compact syntax.
You may want to annotate a group of patterns. When patterns are definitions of named patterns in a grammar and compositors such as group, interleave or choice cannot be used as containers for the annotation, RELAX NG provides a div pattern in its own namespace for this purpose:
<?xml version="1.0" encoding="UTF-8"?> <grammar xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns="http://relaxng.org/ns/structure/1.0"> .../... <div> <xhtml:p>The content of the book element has been split in two named patterns:</xhtml:p> <define name="book-start"> <attribute name="id"/> <ref name="isbn-element"/> <ref name="title-element"/> <zeroOrMore> <ref name="author-element"/> </zeroOrMore> </define> <define name="book-end"> <zeroOrMore> <ref name="author-element"/> </zeroOrMore> <zeroOrMore> <ref name="character-element"/> </zeroOrMore> <attribute name="available"/> </define> </div> .../... </grammar> |
or:
    [ xhtml:p [ "The content of the book element has been split in two named patterns:" ] ]
    div {
      book-start = attribute id { text }, isbn-element, title-element, author-element*
      book-end = author-element*, character-element*, attribute available { text }
    }
The div pattern has no other effect than to group both definitions of the book element in a container. Annotations can then be applied to a single container instead of being applied as multiple, individual definitions. Each of the embedded definitions is still considered to be global to the grammar. They can still be referenced as if they had not been wrapped into a div pattern.
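The point that definitions wrapped in a div remain global can be illustrated with a small Python sketch using the standard library. The function name `named_patterns` is hypothetical, and the sample grammar is a shortened variant of the one above with simplified content models. The sketch simply looks through div elements when listing the named patterns of a grammar.

```python
import xml.etree.ElementTree as ET

RNG = "{http://relaxng.org/ns/structure/1.0}"


def named_patterns(container):
    """List the names defined directly in a grammar, looking through div."""
    names = []
    for child in container:
        if child.tag == RNG + "define":
            names.append(child.get("name"))
        elif child.tag == RNG + "div":
            # A div is only a container: its definitions stay global.
            names.extend(named_patterns(child))
    return names


GRAMMAR = """<grammar xmlns="http://relaxng.org/ns/structure/1.0"
    xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <div>
    <xhtml:p>The content of the book element has been split
      in two named patterns:</xhtml:p>
    <define name="book-start"><attribute name="id"/></define>
    <define name="book-end"><attribute name="available"/></define>
  </div>
  <define name="author-element"><element name="author"><text/></element></define>
</grammar>"""

names = named_patterns(ET.fromstring(GRAMMAR))
print(names)
```

The foreign xhtml:p element is skipped because it is neither a define nor a div, which mirrors the way the annotation leaves the schema itself untouched.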
Using the div element seems like a pretty good idea, but there are other challenges in annotation. One is about taking advantage of more generic mechanisms defined for XML while the second deals with the impossibility of annotating value and param patterns with foreign elements.
There is a tendency in recent XML applications to deprecate the usage of XML comments and Processing Instructions (PIs) and to replace them with XML elements and attributes. There can be some good reasons for doing so. Using elements is more flexible when structured content needs to be added. Also, the lack of namespace support for PIs makes it difficult to rely on names which might have different meanings in different applications. However, it doesn't mean that comments and PIs shouldn't be used in RELAX NG schemas.
Comments are fully supported by RELAX NG. XML comments even have their equivalent in the compact syntax:
<define name="author-element"> <!-- Definition of the author element --> <element name="author"> <attribute name="id"/> <ref name="name-element"/> <ref name="born-element"/> <optional> <ref name="died-element"/> </optional> </element> </define> |
becomes, with the help of the # sign:
    author-element =
      # Definition of the author element
      element author {
        attribute id { text },
        name-element,
        born-element,
        died-element?
      }
As in Unix shells, comments are marked by a hash (#) in the compact syntax. We could spend forever discussing whether this is better or worse than a counterpart based on foreign elements such as:
<define name="author-element"> <xhtml:p>Definition of the author element</xhtml:p> <element name="author"> <attribute name="id"/> <ref name="name-element"/> <ref name="born-element"/> <optional> <ref name="died-element"/> </optional> </element> </define> |
or:
    [ xhtml:p [ "Definition of the author element" ] ]
    author-element =
      element author {
        attribute id { text },
        name-element,
        born-element,
        died-element?
      }
I would argue that the syntax for comments is much more readable in the compact syntax. In the XML syntax, too, comments are more easily spotted when their syntax is different from the XML elements. Readability is, of course, very subjective but there is no reason to avoid comments if you like them. After all, a simple XSLT transformation can transform comments into foreign elements and vice versa. Getting good comments is more important than the syntax used to express them.
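As an illustration of how mechanical this conversion is, here is a minimal sketch of the comment-to-element direction. It is written in Python with the standard library rather than XSLT, so it is a stand-in for the transformation mentioned above and not the transformation itself. The function name `comments_to_elements` is hypothetical, turning each comment into an xhtml:p element is an assumption made for the example, and the `insert_comments` option requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def comments_to_elements(xml_text):
    """Parse a schema and replace each XML comment by an xhtml:p element."""
    # By default ElementTree drops comments; ask the builder to keep them.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    root = ET.fromstring(xml_text, parser=parser)
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag is ET.Comment:
                paragraph = ET.Element("{%s}p" % XHTML_NS)
                paragraph.text = (child.text or "").strip()
                paragraph.tail = child.tail  # keep the surrounding whitespace
                parent[index] = paragraph
    return root


SAMPLE = """<define name="author-element"
    xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- Definition of the author element -->
  <element name="author">
    <attribute name="id"/>
  </element>
</define>"""

converted = comments_to_elements(SAMPLE)
ET.register_namespace("xhtml", XHTML_NS)
print(ET.tostring(converted, encoding="unicode"))
```

Going the other way (foreign elements back to comments) loses any structure inside the annotation, which is one reason to decide early which form the schema should treat as authoritative.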
Note | |
---|---|
Reading comments in the compact syntax is so much easier than reading annotations that I would recommend always using comments unless there are other special requirements. |
The choice would have been similar for PIs if they had an equivalent in the compact syntax. Unfortunately, PIs do not translate into the compact syntax and are discarded during the conversion. If you want to keep the possibility of using both the XML and the compact syntax, you will need to avoid using PIs. So the decision is made for us.
Still, if you like PIs, you can use them in the XML syntax. Like comments, PIs may be more readable than foreign elements. For instance, if we compare:
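Because PIs are lost in the conversion, it can be useful to know whether a schema contains any before converting it. The following Python sketch (standard library only, Python 3.8 or later for the `insert_pis` option) lists the PIs found inside a schema. The function name `find_processing_instructions` and the sample are hypothetical, and the sketch only sees PIs located inside the document element.

```python
import xml.etree.ElementTree as ET


def find_processing_instructions(xml_text):
    """Return the text of every PI found inside the document element."""
    # By default ElementTree drops PIs; ask the builder to keep them.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_pis=True))
    root = ET.fromstring(xml_text, parser=parser)
    return [node.text for node in root.iter()
            if node.tag is ET.ProcessingInstruction]


SAMPLE = """<define name="author-element"
    xmlns="http://relaxng.org/ns/structure/1.0">
  <?sql query="select name, birthdate, deathdate from tbl_author"?>
  <element name="author">
    <attribute name="id"/>
  </element>
</define>"""

pis = find_processing_instructions(SAMPLE)
for pi in pis:
    print("would be lost in the compact syntax:", pi)
```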
<define name="author-element"> <?sql query="select name, birthdate, deathdate from tbl_author"?> <element name="author"> <attribute name="id"/> <ref name="name-element"/> <ref name="born-element"/> <optional> <ref name="died-element"/> </optional> </element> </define> |
and:
<define name="author-element" > <sql:select xmlns:sql="http://www.extensibility.com/saf/spec/safsample/sql-map.saf"> select name, birthdate,deathdate from tbl_author </sql:select> <element name="author"> <attribute name="id"/> <ref name="name-element"/> <ref name="born-element"/> <optional> <ref name="died-element"/> </optional> </element> </define> |
There doesn't seem to be much reason to prefer the second syntax over the first one, beyond the lack of namespace support for PIs mentioned earlier and the greater extensibility of foreign elements.
What if we need to annotate value and param patterns which do not accept foreign elements? There isn't much we can do except for using foreign attributes, XML comments, or PIs (as seen in the previous section), or moving the annotations to another location.
Comments can be used freely in this context:
<element name="born"> <data type="date"> <param name="minInclusive">1900-01-01</param> <param name="maxInclusive">2099-12-31</param> <param name="pattern"> <!-- We don't want timezones in our dates. --> [0-9]{4}-[0-9]{2}-[0-9]{2} </param> </data> </element> |
Or, in the compact syntax:
    element born {
      xsd:date {
        minInclusive = "1900-01-01"
        maxInclusive = "2099-12-31"
        pattern =
          # We don't want timezones in our dates.
          "[0-9]{4}-[0-9]{2}-[0-9]{2}\x{a}"
      }
    }
We can also transform the foreign elements we had wanted to create into attributes with the same names, for instance:
<element name="born"> <data type="date"> <param name="minInclusive">1900-01-01</param> <param name="maxInclusive">2099-12-31</param> <param name="pattern" xhtml:p="We don't want timezones in our dates.">[0-9]{4}-[0-9]{2}-[0-9]{2}</param> </data> </element> |
or:
    element born {
      xsd:date {
        minInclusive = "1900-01-01"
        maxInclusive = "2099-12-31"
        [ xhtml:p = "We don't want timezones in our dates." ]
        pattern = "[0-9]{4}-[0-9]{2}-[0-9]{2}"
      }
    }
Of course, there is no such thing as an xhtml:p attribute, but the meaning seems straightforward enough, at least to human readers. The downside of both workarounds is that we cannot extend them if we have structured content. We might want to do that if we needed to add a link in our comment. In this case, we will need to locate the comment in a foreign element at a different location:
<element name="born"> <data type="date"> <xhtml:p>We don't want timezones in our dates (see <xhtml:a href="ref.xhtml#dates">dates ref</xhtml:a> for additional info.</xhtml:p> <param name="minInclusive">1900-01-01</param> <param name="maxInclusive">2099-12-31</param> <param name="pattern">[0-9]{4}-[0-9]{2}-[0-9]{2}</param> </data> </element> |
or:
    element born {
      [
        xhtml:p [
          "We don't want timezones in our dates (see "
          xhtml:a [ href = "ref.xhtml#dates" "dates ref" ]
          " for additional info."
        ]
      ]
      xsd:date {
        minInclusive = "1900-01-01"
        maxInclusive = "2099-12-31"
        pattern = "[0-9]{4}-[0-9]{2}-[0-9]{2}"
      }
    }
Note that we have lost the relation between the annotation and the parameter it describes. One way to get this information back is to add an identifier to the annotation and use a mechanism such as XLink to define a link between our param element and the annotation:
<element name="born"> <data type="date"> <xhtml:p id="dates-notz">We don't want timezones in our dates (see <xhtml:a href="ref.xhtml#dates">dates ref</xhtml:a> for additional info.</xhtml:p> <param name="minInclusive">1900-01-01</param> <param name="maxInclusive">2099-12-31</param> <param name="pattern" xlink:type="simple" xlink:arcrole="http://www.rddl.org/purposes#reference" xlink:href="#dates-notz" >[0-9]{4}-[0-9]{2}-[0-9]{2}</param> </data> </element> |
or:
    element born {
      [
        xhtml:p [
          id = "dates-notz"
          "We don't want timezones in our dates (see "
          xhtml:a [ href = "ref.xhtml#dates" "dates ref" ]
          " for additional info."
        ]
      ]
      xsd:date {
        minInclusive = "1900-01-01"
        maxInclusive = "2099-12-31"
        [
          xlink:type = "simple"
          xlink:arcrole = "http://www.rddl.org/purposes#reference"
          xlink:href = "#dates-notz"
        ]
        pattern = "[0-9]{4}-[0-9]{2}-[0-9]{2}"
      }
    }
Another option is to change the rules of the game and state that the annotation does not apply to the parent element, but to the preceding element. For instance, we will see in the next section that RELAX NG's DTD compatibility specification uses the trick of shifting the annotation from the parent element to the preceding element. Applied to our example, this would lead to writing:
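An application that wants to use such a link has to resolve it itself, since RELAX NG ignores it. Here is a minimal sketch in Python (standard library only) of how the link from the param element back to its annotation might be followed in the XML syntax. The function name `annotation_for` is hypothetical, the sample is a shortened version of the schema above, and the sketch assumes that the link is a same-document fragment pointing at an unqualified id attribute.

```python
import xml.etree.ElementTree as ET

RNG = "{http://relaxng.org/ns/structure/1.0}"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def annotation_for(root, node):
    """Follow a same-document xlink:href ('#some-id') from node, if any."""
    href = node.get(XLINK_HREF, "")
    if not href.startswith("#"):
        return None  # no link, or a link to another document
    wanted = href[1:]
    for candidate in root.iter():
        if candidate.get("id") == wanted:
            return candidate
    return None


SAMPLE = """<element name="born"
    xmlns="http://relaxng.org/ns/structure/1.0"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <data type="date">
    <xhtml:p id="dates-notz">We don't want timezones in our dates.</xhtml:p>
    <param name="pattern" xlink:type="simple"
           xlink:href="#dates-notz">[0-9]{4}-[0-9]{2}-[0-9]{2}</param>
  </data>
</element>"""

root = ET.fromstring(SAMPLE)
param = root.find(".//" + RNG + "param")
note = annotation_for(root, param)
print("".join(note.itertext()))
```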
    element born {
      xsd:date {
        minInclusive = "1900-01-01"
        maxInclusive = "2099-12-31"
        [
          xhtml:p [
            "We don't want timezones in our dates (see "
            xhtml:a [ href = "ref.xhtml#dates" "dates ref" ]
            " for additional info."
          ]
        ]
        pattern = "[0-9]{4}-[0-9]{2}-[0-9]{2}"
      }
    }
All text is copyright Eric van der Vlist, Dyomedea. During development, I give permission for non-commercial copying for educational and review purposes. After publication, all text will be released under the Free Software Foundation GFDL.