by Eric van der Vlist is published by O'Reilly & Associates (ISBN: 0596004214)
RELAX NG doesn't define specific elements and attributes reserved for annotations. Instead, it opens its language: foreign attributes (attributes from any namespace other than the RELAX NG namespace) may appear on all RELAX NG elements, and foreign elements (elements from no namespace or from any namespace other than the RELAX NG namespace) may appear in every RELAX NG element whose content model is empty or element only. That covers all RELAX NG elements except value and param, which have a text-only content model. RELAX NG thus strictly follows the principles of an open schema presented in the previous chapter.
In the XML syntax, adding annotations is both easy and flexible. It's a very straightforward process to add annotations using foreign elements. For instance, here I've added some Dublin Core (dc) elements to our grammar to identify its title and author:
    <?xml version="1.0" encoding="utf-8"?>
    <grammar xmlns="http://relaxng.org/ns/structure/1.0"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>RELAX NG flat schema for our library</dc:title>
      <dc:author>Eric van der Vlist</dc:author>
      <start>
        <element name="library">
          <oneOrMore>
            <ref name="book-element"/>
          </oneOrMore>
        </element>
      </start>
      ...
    </grammar>
or perhaps some XHTML documentation:
    <?xml version="1.0" encoding="utf-8"?>
    <grammar xmlns="http://relaxng.org/ns/structure/1.0"
             xmlns:xhtml="http://www.w3.org/1999/xhtml">
      <xhtml:div>
        <xhtml:h1>RELAX NG flat schema for our library</xhtml:h1>
        <xhtml:p>This schema has been written by <xhtml:a href="http://dyomedea.com/vdv">Eric van der Vlist</xhtml:a>.</xhtml:p>
      </xhtml:div>
      ...
    </grammar>
or perhaps I want to use XLink through attributes:
    <?xml version="1.0" encoding="utf-8"?>
    <grammar xmlns="http://relaxng.org/ns/structure/1.0"
             xmlns:xlink="http://www.w3.org/1999/xlink">
      <start>
        <element name="library"
                 xlink:type="simple"
                 xlink:role="http://www.w3.org/1999/xhtml"
                 xlink:arcrole="http://www.rddl.org/purposes#reference"
                 xlink:href="library.xhtml">
          <oneOrMore>
            <ref name="book-element"/>
          </oneOrMore>
        </element>
      </start>
      ...
    </grammar>
RELAX NG itself won't know what to do with this extra information—that's up to processors built specifically for handling the annotations—but it will quietly ignore all this extra information, letting you bundle whatever information you like into the schema without disrupting it.
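As an illustration of what such an annotation-aware processor might do, here is a minimal Python sketch that walks a schema in the XML syntax and lists its foreign elements and attributes. It is not part of RELAX NG or of any existing tool: the helper names (`split_name`, `collect_annotations`) and the sample schema are assumptions made for this example, and it relies only on the standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"

# Sample schema assembled from the Dublin Core and XLink examples above.
SCHEMA = """<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:xlink="http://www.w3.org/1999/xlink">
  <dc:title>RELAX NG flat schema for our library</dc:title>
  <dc:author>Eric van der Vlist</dc:author>
  <start>
    <element name="library" xlink:type="simple" xlink:href="library.xhtml">
      <oneOrMore>
        <ref name="book-element"/>
      </oneOrMore>
    </element>
  </start>
</grammar>"""


def split_name(qname):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if qname.startswith("{"):
        namespace, local = qname[1:].split("}", 1)
        return namespace, local
    return "", qname


def collect_annotations(root):
    """Return (foreign_elements, foreign_attributes) found in a schema tree."""
    foreign_elements = []
    foreign_attributes = []

    def visit(node):
        owner = split_name(node.tag)[1]
        for name, value in node.attrib.items():
            namespace, local = split_name(name)
            # Unqualified attributes (name, type, ...) belong to RELAX NG itself.
            if namespace and namespace != RNG_NS:
                foreign_attributes.append((owner, namespace, local, value))
        for child in node:
            namespace, local = split_name(child.tag)
            if namespace != RNG_NS:
                # Record the annotation, but do not descend into its content.
                text = "".join(child.itertext()).strip()
                foreign_elements.append((namespace, local, text))
            else:
                visit(child)

    visit(root)
    return foreign_elements, foreign_attributes


elements, attributes = collect_annotations(ET.fromstring(SCHEMA))
for namespace, local, text in elements:
    print("element  ", local, "=", text)
for owner, namespace, local, value in attributes:
    print("attribute", local, "on", owner, "=", value)
```

The sketch treats an element from no namespace as foreign, in line with the rule stated at the start of this section, and it stops at the first foreign element on each branch because everything below it is annotation content rather than schema.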
Annotations are much more challenging to use correctly when using the compact syntax. Because it isn't XML, the compact syntax has no built-in support for this kind of extensibility; an alternative syntax based on square brackets ([]) has been developed to embed XML structures within the compact syntax. Unfortunately, the square brackets and XML aren't a delightful mix with the other punctuation used in the compact syntax. The syntax for including annotations within a schema is slightly different according to their location in the schema.
The easiest annotations to write are for foreign elements in a grammar element. These annotations are called grammar annotations, and they do the same work as the first two examples shown earlier with the XML syntax. First, the Dublin Core annotations look like this in RELAX NG's XML syntax:
    <?xml version="1.0" encoding="utf-8"?>
    <grammar xmlns="http://relaxng.org/ns/structure/1.0"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>RELAX NG flat schema for our library</dc:title>
      <dc:author>Eric van der Vlist</dc:author>
      <start>
        <element name="library">
          <oneOrMore>
            <ref name="book-element"/>
          </oneOrMore>
        </element>
      </start>
      ...
    </grammar>
For the compact syntax, use the namespace-qualified name of the annotation, followed by a left square bracket, its contents, and a right square bracket. The annotated schema listed earlier is written:
    namespace dc = "http://purl.org/dc/elements/1.1/"
    dc:title [ "RELAX NG flat schema for our library" ]
    dc:author [ "Eric van der Vlist" ]
    start = element library { book-element+ }
The use of the qualified name (dc:title or dc:author) is specific to grammar annotations, while the syntax [ element content ] that represents its content is more generic.
These annotations can have structured content with child elements and attributes. Let's reexamine our XHTML example:
    <?xml version="1.0" encoding="utf-8"?>
    <grammar xmlns="http://relaxng.org/ns/structure/1.0"
             xmlns:xhtml="http://www.w3.org/1999/xhtml">
      <xhtml:div>
        <xhtml:h1>RELAX NG flat schema for our library</xhtml:h1>
        <xhtml:p>This schema has been written by <xhtml:a href="http://dyomedea.com/vdv">Eric van der Vlist</xhtml:a>.</xhtml:p>
      </xhtml:div>
      ...
    </grammar>
In the compact syntax, I used an approach similar to that used for the Dublin Core example, but with more square brackets to represent nested element and attribute structures:
    namespace xhtml = "http://www.w3.org/1999/xhtml"
    xhtml:div [
      xhtml:h1 [ "RELAX NG flat schema for our library" ]
      xhtml:p [
        "This schema has been written by "
        xhtml:a [ href = "http://dyomedea.com/vdv" "Eric van der Vlist" ]
        "."
      ]
    ]
    start = element library { book-element+ }
    ...
The syntax used for the Dublin Core example has here been applied recursively and the href attribute has been expressed as href = "http://dyomedea.com/vdv".
These grammar annotations always represent foreign elements. Another mechanism (initial annotations) expresses annotations representing foreign attributes.
Initial annotations define annotations (foreign elements or attributes) that are attached to the next pattern: in the XML syntax, foreign attributes become attributes of the element representing that pattern, and foreign elements become its first children. This is the option you must always use to define annotations as foreign attributes, such as those used in the XLink example:
    <?xml version="1.0" encoding="utf-8"?>
    <grammar xmlns="http://relaxng.org/ns/structure/1.0"
             xmlns:xlink="http://www.w3.org/1999/xlink">
      <start>
        <element name="library"
                 xlink:type="simple"
                 xlink:role="http://www.w3.org/1999/xhtml"
                 xlink:arcrole="http://www.rddl.org/purposes#reference"
                 xlink:href="library.xhtml">
          <oneOrMore>
            <ref name="book-element"/>
          </oneOrMore>
        </element>
      </start>
      ...
    </grammar>
Initial annotations don't begin with a qualified name because they apply to the declaration that follows them, not to an independent element. The XLink example is therefore written:
    namespace xlink = "http://www.w3.org/1999/xlink"
    start =
      [
        xlink:type = "simple"
        xlink:role = "http://www.w3.org/1999/xhtml"
        xlink:arcrole = "http://www.rddl.org/purposes#reference"
        xlink:href = "library.xhtml"
      ]
      element library { book-element+ }
Note how the foreign attributes have been wrapped within square brackets in the compact syntax, and also that the annotations precede the element pattern rather than being included in it. Wrapping annotations in square brackets with no name preceding them is what makes them an initial annotation. Initial annotations can be used with attributes, elements, or both. For example, I can combine the Dublin Core example with the XLink example using initial annotations. In the RELAX NG XML syntax, it looks like:
    <?xml version="1.0" encoding="utf-8"?>
    <grammar xmlns="http://relaxng.org/ns/structure/1.0"
             xmlns:xlink="http://www.w3.org/1999/xlink"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <start>
        <element name="library"
                 xlink:type="simple"
                 xlink:role="http://www.w3.org/1999/xhtml"
                 xlink:arcrole="http://www.rddl.org/purposes#reference"
                 xlink:href="library.xhtml">
          <dc:title>The library element</dc:title>
          <dc:author>Eric van der Vlist</dc:author>
          <oneOrMore>
            <ref name="book-element"/>
          </oneOrMore>
        </element>
      </start>
or, in the compact syntax:
    namespace xlink = "http://www.w3.org/1999/xlink"
    namespace dc = "http://purl.org/dc/elements/1.1/"
    start =
      [
        xlink:type = "simple"
        xlink:role = "http://www.w3.org/1999/xhtml"
        xlink:arcrole = "http://www.rddl.org/purposes#reference"
        xlink:href = "library.xhtml"
        dc:title [ "The library element" ]
        dc:author [ "Eric van der Vlist" ]
      ]
      element library { book-element+ }
Again, note how the annotations precede the element pattern to indicate that they become its foreign attributes and first child elements in the XML syntax. The same rule applies to foreign attributes of the grammar pattern, such as:
    <?xml version="1.0" encoding="utf-8"?>
    <grammar xmlns="http://relaxng.org/ns/structure/1.0"
             xmlns:xlink="http://www.w3.org/1999/xlink"
             xlink:type="simple"
             xlink:role="http://www.w3.org/1999/xhtml"
             xlink:arcrole="http://www.rddl.org/purposes#reference"
             xlink:href="grammar.xhtml">
      ...
    </grammar>
In this case, to define the annotations before the grammar pattern, I need to write the grammar pattern explicitly, something usually unnecessary with the compact syntax:
    namespace xlink = "http://www.w3.org/1999/xlink"
    [
      xlink:type = "simple"
      xlink:role = "http://www.w3.org/1999/xhtml"
      xlink:arcrole = "http://www.rddl.org/purposes#reference"
      xlink:href = "grammar.xhtml"
    ]
    grammar {
      ...
    }
Here's an example of how to define annotations that are neither initial nor grammar annotations. Note that the XHTML element is in the middle of the declaration:
    <define name="author-element">
      <element name="author">
        <attribute name="id"/>
        <ref name="name-element"/>
        <ref name="born-element"/>
        <xhtml:p>After this point, everything is optional.</xhtml:p>
        <optional>
          <ref name="died-element"/>
        </optional>
      </element>
    </define>
You can define annotations that aren't initial or grammar using a third syntax reserved for following annotations. Here's how to make the previous example work:
    author-element =
      element author {
        attribute id { text },
        name-element,
        born-element >> xhtml:p [ "After this point, everything is optional." ],
        died-element?
      }
Note the new syntax: >> xhtml:p [ "After this point, everything is optional." ]. The leading >> signals a following annotation. In the XML syntax, a following annotation is inserted as a following sibling of the element that represents the pattern it is attached to (here, the reference to born-element).
In the following perverse schema snippet, annotations have been added in nearly every location where there was room for them:
    <?xml version="1.0" encoding="utf-8"?>
    <grammar xmlns="http://relaxng.org/ns/structure/1.0"
             xmlns:ann="http://dyomedea.com/examples/ns/annotations"
             ann:attribute="Annotation as foreign attribute for 'grammar'">
      <ann:element>Initial annotation as foreign element for "grammar"</ann:element>
      <start ann:attribute="Annotation as a foreign attribute for 'start'">
        <ann:element>Initial annotation as foreign element for "start"</ann:element>
        <element name="library"
                 ann:attribute="Annotation as a foreign attribute for 'element'">
          <ann:element>Initial annotation as foreign element for "element"</ann:element>
          <oneOrMore ann:attribute="Annotation as a foreign attribute for 'oneOrMore'">
            <ann:element>Initial annotation as foreign element for "oneOrMore"</ann:element>
            <ref name="book-element"
                 ann:attribute="Annotation as a foreign attribute for 'ref'">
              <ann:element>Initial annotation as foreign element for "ref"</ann:element>
            </ref>
            <ann:element>Following annotation as foreign element for "oneOrMore"</ann:element>
          </oneOrMore>
          <ann:element>Following annotation as foreign element for "element"</ann:element>
        </element>
        <ann:element>Following annotation as foreign element for "start"</ann:element>
      </start>
      <ann:element>Grammar annotation as foreign element for "grammar"</ann:element>
      .../
    </grammar>
or, in the compact syntax:
    namespace ann = "http://dyomedea.com/examples/ns/annotations"
    [
      ann:attribute = 'Annotation as foreign attribute for "grammar"'
      ann:element [ 'Initial annotation as foreign element for "grammar"' ]
    ]
    grammar {
      [
        ann:attribute = "Annotation as a foreign attribute for 'start'"
        ann:element [ 'Initial annotation as foreign element for "start"' ]
      ]
      start =
        [
          ann:attribute = "Annotation as a foreign attribute for 'element'"
          ann:element [ 'Initial annotation as foreign element for "element"' ]
        ]
        element library {
          [
            ann:attribute = "Annotation as a foreign attribute for 'oneOrMore'"
            ann:element [ 'Initial annotation as foreign element for "oneOrMore"' ]
          ]
          (
            [
              ann:attribute = "Annotation as a foreign attribute for 'ref'"
              ann:element [ 'Initial annotation as foreign element for "ref"' ]
            ]
            book-element
            >> ann:element [ 'Following annotation as foreign element for "oneOrMore"' ]+
          )
          >> ann:element [ 'Following annotation as foreign element for "element"' ]
        }
        >> ann:element [ 'Following annotation as foreign element for "start"' ]
      ann:element [ 'Grammar annotation as foreign element for "grammar"' ]
      .../...
    }
Although the compact syntax is strictly equivalent to the XML syntax, it's difficult to read and tough to specify where each of these annotations belongs. I hope that this example has been compelling enough (and, for once, confusing enough) to convince you that even though application-specific syntaxes that are more concise and easier to read than XML can be defined, when there is a need for extensibility and interoperability, XML is a clear winner.
A riddle before we move on: what does this annotation mean?
    element born {
      xsd:date {
        [ xhtml:p [ "Add new parameters here to define a range." ] ]
        pattern = "[0-9]{4}-[0-9]{2}-[0-9]{2}"
      }
    }
It can't be a following annotation on the pattern parameter, because parameters have a text-only content model and can't accept foreign elements. RELAX NG concludes that, in this case, the example is a following annotation on the definition of the data content of the born element. This answer makes the compact syntax riddle equivalent to:
    <element name="born">
      <data type="date">
        <param name="pattern">[0-9]{4}-[0-9]{2}-[0-9]{2}</param>
        <xhtml:p>Add new parameters here to define a range.</xhtml:p>
      </data>
    </element>
Note that this same issue also arises with the value pattern. With both value and param, the normal syntax using a following annotation can't be used in the compact syntax.
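Because value and param are text-only, a schema in the XML syntax that places a foreign element inside one of them is invalid. The following Python sketch shows one way a homemade checker might flag such misplaced annotations before a schema is handed to a validator. The function name `misplaced_annotations` and the deliberately wrong sample schema are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

RNG = "{http://relaxng.org/ns/structure/1.0}"

# Deliberately wrong: the xhtml:p annotation sits inside the param element.
BAD_SCHEMA = """<element xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:xhtml="http://www.w3.org/1999/xhtml" name="born">
  <data type="date">
    <param name="pattern">[0-9]{4}-[0-9]{2}-[0-9]{2}<xhtml:p>Add new
      parameters here to define a range.</xhtml:p></param>
  </data>
</element>"""


def misplaced_annotations(root):
    """List (container, child tag) pairs for elements found in value or param."""
    problems = []
    for container in ("value", "param"):
        for node in root.iter(RNG + container):
            # value and param are text-only, so any child element is an error.
            for child in node:
                problems.append((container, child.tag))
    return problems


problems = misplaced_annotations(ET.fromstring(BAD_SCHEMA))
for container, tag in problems:
    print("element", tag, "is not allowed inside", container)
```

Foreign attributes on value and param remain legal, so the sketch looks only at child elements.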
You might want to annotate a group of patterns. When patterns are definitions of named patterns in a grammar, and compositors such as group, interleave, or choice can't be used as containers for the annotation, RELAX NG provides a div pattern in its own namespace for this purpose:
    <?xml version="1.0" encoding="UTF-8"?>
    <grammar xmlns:xhtml="http://www.w3.org/1999/xhtml"
             xmlns="http://relaxng.org/ns/structure/1.0">
      ...
      <div>
        <xhtml:p>The content of the book element has been split into two named patterns:</xhtml:p>
        <define name="book-start">
          <attribute name="id"/>
          <ref name="isbn-element"/>
          <ref name="title-element"/>
          <zeroOrMore>
            <ref name="author-element"/>
          </zeroOrMore>
        </define>
        <define name="book-end">
          <zeroOrMore>
            <ref name="author-element"/>
          </zeroOrMore>
          <zeroOrMore>
            <ref name="character-element"/>
          </zeroOrMore>
          <attribute name="available"/>
        </define>
      </div>
      ...
    </grammar>
or:
    [ xhtml:p [ "The content of the book element has been split into two named patterns:" ] ]
    div {
      book-start =
        attribute id { text },
        isbn-element,
        title-element,
        author-element*
      book-end =
        author-element*,
        character-element*,
        attribute available { text }
    }
The div pattern has no effect other than to group both definitions of the book element in a container. Annotations can then be applied to a single container instead of to multiple individual definitions. Each embedded definition is still global to the grammar and can still be referenced as if it hadn't been wrapped in a div pattern.
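To make the "still global" point concrete, here is a small Python sketch of how a tool reading the XML syntax might list a grammar's named patterns while looking straight through div wrappers. The function name `global_definitions` and the trimmed-down sample grammar are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

RNG = "{http://relaxng.org/ns/structure/1.0}"

# Trimmed-down grammar: two definitions inside a div, one outside.
SCHEMA = """<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <div>
    <xhtml:p>The content of the book element has been split into two named patterns:</xhtml:p>
    <define name="book-start"><attribute name="id"/></define>
    <define name="book-end"><attribute name="available"/></define>
  </div>
  <define name="isbn-element"><element name="isbn"><text/></element></define>
</grammar>"""


def global_definitions(container):
    """Yield the names of definitions, treating div as a transparent wrapper."""
    for child in container:
        if child.tag == RNG + "define":
            yield child.get("name")
        elif child.tag == RNG + "div":
            # A div adds no scope of its own, so its definitions are listed
            # exactly as if they were direct children of the grammar.
            yield from global_definitions(child)


names = list(global_definitions(ET.fromstring(SCHEMA)))
print(names)
```

The foreign xhtml:p element is simply skipped, which is the behavior the section describes for any processor that doesn't care about annotations.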
Using the div element seems like a pretty good idea, but annotation poses two other challenges. The first is whether to take advantage of the more generic mechanisms defined for XML (comments and processing instructions); the second is the impossibility of annotating value and param patterns with foreign elements.
There is a tendency in recent XML applications to deprecate the usage of XML comments and processing instructions (PIs) and to replace them with XML elements and attributes. There are sometimes good reasons for doing so. Using elements is more flexible when structured content needs to be added. Also, the lack of namespace support for PIs makes it difficult to rely on names that might have different meanings in different applications. However, these reasons don't mean that comments and PIs shouldn't be used in RELAX NG schemas.
Comments are fully supported by RELAX NG. XML comments even have their equivalent in the compact syntax:
    <define name="author-element">
      <!-- Definition of the author element -->
      <element name="author">
        <attribute name="id"/>
        <ref name="name-element"/>
        <ref name="born-element"/>
        <optional>
          <ref name="died-element"/>
        </optional>
      </element>
    </define>
which becomes, with the help of the # sign:
    author-element =
      # Definition of the author element
      element author {
        attribute id { text },
        name-element,
        born-element,
        died-element?
      }
As in Unix shells, comments are marked by a hash (#) in the compact syntax. I could discuss forever whether this is better or worse than a counterpart based on foreign elements such as:
    <define name="author-element">
      <xhtml:p>Definition of the author element</xhtml:p>
      <element name="author">
        <attribute name="id"/>
        <ref name="name-element"/>
        <ref name="born-element"/>
        <optional>
          <ref name="died-element"/>
        </optional>
      </element>
    </define>
or:
    [ xhtml:p [ "Definition of the author element" ] ]
    author-element =
      element author {
        attribute id { text },
        name-element,
        born-element,
        died-element?
      }
I would argue that the syntax for comments is much more readable in the compact syntax. In the XML syntax too, comments are more easily spotted when their syntax is different from the XML elements. Readability is of course very subjective, but there is no reason to avoid comments if you like them. After all, a simple XSLT transformation can transform comments into foreign elements and vice versa. Getting good comments is more important than the syntax used to express them.
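The text mentions XSLT for this conversion. As a rough equivalent in Python, and only as a sketch, the following code rewrites each XML comment of a schema as an xhtml:p foreign element. The function name `comments_to_elements` is an assumption made for this example, and the `insert_comments` option it relies on requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

SCHEMA = """<define xmlns="http://relaxng.org/ns/structure/1.0" name="author-element">
  <!-- Definition of the author element -->
  <element name="author">
    <attribute name="id"/>
  </element>
</define>"""


def comments_to_elements(root):
    """Replace each comment node in place with an xhtml:p foreign element."""
    # Snapshot the nodes first so the tree isn't modified while being walked.
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag is ET.Comment:
                paragraph = ET.Element("{%s}p" % XHTML_NS)
                paragraph.text = (child.text or "").strip()
                paragraph.tail = child.tail  # keep the surrounding whitespace
                parent[index] = paragraph
    return root


# ElementTree drops comments by default; insert_comments keeps them in the tree.
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
root = comments_to_elements(ET.fromstring(SCHEMA, parser=parser))

ET.register_namespace("xhtml", XHTML_NS)
print(ET.tostring(root, encoding="unicode"))
```

The reverse direction would work the same way, replacing selected foreign elements with comment nodes created by `ET.Comment`.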
Tip: Reading comments in the compact syntax is so much easier than reading annotations that I recommend always using comments unless there are other special requirements.
The same recommendation would hold for choosing between methods of adding processing instructions if they had an equivalent in the compact syntax. Unfortunately, PIs don't translate into the compact syntax and are discarded during the conversion. If you want to keep the option of using both the XML and the compact syntax, you will need to avoid using PIs. So a decision has been made for you.
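If you intend to keep both syntaxes, it can help to detect PIs before converting. The sketch below, in Python, lists the processing instructions found in a schema so that they can be reviewed first. The function name `find_processing_instructions` is an assumption made for this example, and the `insert_pis` option requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

SCHEMA = """<define xmlns="http://relaxng.org/ns/structure/1.0" name="author-element">
  <?sql query="select name, birthdate, deathdate from tbl_author"?>
  <element name="author">
    <attribute name="id"/>
  </element>
</define>"""


def find_processing_instructions(xml_text):
    """Return the text ('target data') of every PI inside the root element."""
    # ElementTree drops PIs by default; insert_pis keeps them in the tree.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_pis=True))
    root = ET.fromstring(xml_text, parser=parser)
    return [
        node.text
        for node in root.iter()
        if node.tag is ET.ProcessingInstruction
    ]


pis = find_processing_instructions(SCHEMA)
for pi in pis:
    print("PI that a conversion to the compact syntax would lose:", pi)
```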
Still, if you like PIs, you can use them in the XML syntax. Like comments, PIs can be more readable than foreign elements. For instance, compare:
    <define name="author-element">
      <?sql query="select name, birthdate, deathdate from tbl_author"?>
      <element name="author">
        <attribute name="id"/>
        <ref name="name-element"/>
        <ref name="born-element"/>
        <optional>
          <ref name="died-element"/>
        </optional>
      </element>
    </define>
and:
    <define name="author-element">
      <sql:select xmlns:sql="http://www.extensibility.com/saf/spec/safsample/sql-map.saf">
        select name, birthdate, deathdate from tbl_author
      </sql:select>
      <element name="author">
        <attribute name="id"/>
        <ref name="name-element"/>
        <ref name="born-element"/>
        <optional>
          <ref name="died-element"/>
        </optional>
      </element>
    </define>
There doesn't seem to be much reason to prefer the second syntax over the first one, beyond the lack of namespace support for PIs mentioned earlier and the greater extensibility of foreign elements.
What if you need to annotate value and param patterns that don't accept foreign elements? There isn't much you can do except use foreign attributes, XML comments, PIs (as seen in the previous section), or move the annotations to another location.
Comments can be used freely in this context:
    <element name="born">
      <data type="date">
        <param name="minInclusive">1900-01-01</param>
        <param name="maxInclusive">2099-12-31</param>
        <param name="pattern">
          <!-- We don't want timezones in our dates. -->
          [0-9]{4}-[0-9]{2}-[0-9]{2}
        </param>
      </data>
    </element>
or, in the compact syntax:
    element born {
      xsd:date {
        minInclusive = "1900-01-01"
        maxInclusive = "2099-12-31"
        pattern =
          # We don't want timezones in our dates.
          "[0-9]{4}-[0-9]{2}-[0-9]{2}\x{a}"
      }
    }
You can also transform the foreign elements you want to create into attributes with the same names, for instance:
    <element name="born">
      <data type="date">
        <param name="minInclusive">1900-01-01</param>
        <param name="maxInclusive">2099-12-31</param>
        <param name="pattern"
               xhtml:p="We don't want timezones in our dates.">[0-9]{4}-[0-9]{2}-[0-9]{2}</param>
      </data>
    </element>
or:
    element born {
      xsd:date {
        minInclusive = "1900-01-01"
        maxInclusive = "2099-12-31"
        [ xhtml:p = "We don't want timezones in our dates." ]
        pattern = "[0-9]{4}-[0-9]{2}-[0-9]{2}"
      }
    }
Of course, there is no such thing as an xhtml:p attribute, but the meaning seems straightforward enough, at least to human readers. The downside of both workarounds is that you can't extend them if you have structured content. You might want to do that if you need to add a link in your comment. In this case, you need to locate the comment in a foreign element at a different location:
    <element name="born">
      <data type="date">
        <xhtml:p>We don't want timezones in our dates (see <xhtml:a href="ref.xhtml#dates">dates ref</xhtml:a> for additional info).</xhtml:p>
        <param name="minInclusive">1900-01-01</param>
        <param name="maxInclusive">2099-12-31</param>
        <param name="pattern">[0-9]{4}-[0-9]{2}-[0-9]{2}</param>
      </data>
    </element>
or:
    element born {
      [
        xhtml:p [
          "We don't want timezones in our dates (see "
          xhtml:a [ href = "ref.xhtml#dates" "dates ref" ]
          " for additional info)."
        ]
      ]
      xsd:date {
        minInclusive = "1900-01-01"
        maxInclusive = "2099-12-31"
        pattern = "[0-9]{4}-[0-9]{2}-[0-9]{2}"
      }
    }
Note that this example has lost the relation between the annotation and the parameter it describes: nothing in the annotation's new location says that it is about the pattern parameter. One way to get this information back is to add an identifier to the annotation and use a mechanism such as XLink to define a link between the param element and the annotation:
    <element name="born">
      <data type="date">
        <xhtml:p id="dates-notz">We don't want timezones in our dates (see <xhtml:a href="ref.xhtml#dates">dates ref</xhtml:a> for additional info).</xhtml:p>
        <param name="minInclusive">1900-01-01</param>
        <param name="maxInclusive">2099-12-31</param>
        <param name="pattern"
               xlink:type="simple"
               xlink:arcrole="http://www.rddl.org/purposes#reference"
               xlink:href="#dates-notz">[0-9]{4}-[0-9]{2}-[0-9]{2}</param>
      </data>
    </element>
or:
    element born {
      [
        xhtml:p [
          id = "dates-notz"
          "We don't want timezones in our dates (see "
          xhtml:a [ href = "ref.xhtml#dates" "dates ref" ]
          " for additional info)."
        ]
      ]
      xsd:date {
        minInclusive = "1900-01-01"
        maxInclusive = "2099-12-31"
        [
          xlink:type = "simple"
          xlink:arcrole = "http://www.rddl.org/purposes#reference"
          xlink:href = "#dates-notz"
        ]
        pattern = "[0-9]{4}-[0-9]{2}-[0-9]{2}"
      }
    }
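A processor that wants to follow such links has to resolve them itself. Here is a minimal Python sketch of that step for the XML syntax: it indexes annotations by their id attribute and maps each linked param to the text of its annotation. The function name `linked_annotations` and the shortened sample are assumptions made for this example, and treating a plain id attribute as an identifier is a convention of the sketch, since no DTD declares it as an ID.

```python
import xml.etree.ElementTree as ET

RNG = "{http://relaxng.org/ns/structure/1.0}"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Shortened version of the linked example in the XML syntax.
SCHEMA = """<element xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:xhtml="http://www.w3.org/1999/xhtml"
         xmlns:xlink="http://www.w3.org/1999/xlink" name="born">
  <data type="date">
    <xhtml:p id="dates-notz">We don't want timezones in our dates.</xhtml:p>
    <param name="minInclusive">1900-01-01</param>
    <param name="pattern" xlink:type="simple"
           xlink:href="#dates-notz">[0-9]{4}-[0-9]{2}-[0-9]{2}</param>
  </data>
</element>"""


def linked_annotations(root):
    """Map each param name to the text of the annotation its XLink points to."""
    # Index every element carrying an id attribute (by convention, not by DTD).
    by_id = {node.get("id"): node for node in root.iter() if node.get("id")}
    links = {}
    for param in root.iter(RNG + "param"):
        href = param.get(XLINK_HREF, "")
        # Only same-document references of the form "#identifier" are resolved.
        if href.startswith("#") and href[1:] in by_id:
            links[param.get("name")] = "".join(by_id[href[1:]].itertext())
    return links


links = linked_annotations(ET.fromstring(SCHEMA))
print(links)
```

Parameters without a link, such as minInclusive here, are simply left out of the result.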
Another option is to change the rules of the game and state that the annotation doesn't apply to the parent element, but to the preceding element. For instance, you will see in the next section that RELAX NG's DTD compatibility specification uses the trick of shifting the annotation from the parent element to the preceding element. Applied to our example:
    element born {
      xsd:date {
        minInclusive = "1900-01-01"
        maxInclusive = "2099-12-31"
        [
          xhtml:p [
            "We don't want timezones in our dates (see "
            xhtml:a [ href = "ref.xhtml#dates" "dates ref" ]
            " for additional info)."
          ]
        ]
        pattern = "[0-9]{4}-[0-9]{2}-[0-9]{2}"
      }
    }
This text is released under the Free Software Foundation GFDL.