Restrictions

RELAX NG by Eric van der Vlist will be published by O'Reilly & Associates (ISBN: 0596004214)

Chapter 15: Simplification And Restrictions

With the exception of constraints expressed by the RELAX NG schema for RELAX NG and those which are part of the simplification itself, RELAX NG defines all the restrictions on schema structures as they apply to the simplified version. Most of them are obvious and easy to understand.

Constraints on attributes

RELAX NG's constraints match the constraints on attributes defined by the XML 1.0 recommendation:

Attributes can't contain other attributes: attribute patterns can't have another attribute pattern in their descendants.
Attributes can't contain elements: attribute patterns can't have a ref pattern in their descendants.
Attributes can't be duplicated: an attribute may not be found in a oneOrMore pattern with a combination by group or interleave. Furthermore, if attribute patterns are combined in a group or interleave pattern, their name classes must not overlap: they cannot have any name which belongs to both name classes.
Attributes which have an infinite name class (anyName or nsName) must be enclosed in a oneOrMore pattern. In other words, we can't require exactly one, or any other fixed number of, occurrences of these attributes. Their content model must also be text (data patterns are forbidden here).
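
As a minimal sketch of the last rule (the element name foo is only illustrative), the first definition below satisfies the restriction, while the two commented-out variants would be rejected:

```rnc
# Valid: the wildcard attribute is repeated and its content is text.
start =
  element foo {
    attribute * { text }*,
    text
  }

# Invalid: infinite name class outside of a oneOrMore pattern.
#   start = element foo { attribute * { text }, text }

# Invalid: infinite name class with a data pattern as its content.
#   start = element foo { attribute * { xsd:token }*, text }
```

The `*` suffix is simplified into a choice between empty and a oneOrMore pattern, which is why it satisfies the rule.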

Let's explore schemas which may look valid at a quick glance but are going to collide with these restrictions.

Bad Example: attribute content model

This schema states that any content model can be accepted in the bar attribute:

 anything =
   (element * { anything }
    | attribute * { text }
    | text)*
 start =
   element foo {
     attribute bar { anything },
     text
   }

Unfortunately, it's translated into:

 start = __foo-elt-id2602800
 __-elt-id2602788 =
   element * {
     empty
     | ((__-elt-id2602788
         | attribute * { text })
        | text)+
   }
 __foo-elt-id2602800 =
   element foo {
     attribute bar {
       empty
       | ((__-elt-id2602788
           | attribute * { text })
          | text)+
     },
     text
   }

The content of the bar attribute now allows a reference to a named pattern (which means an element in the simplified syntax) and an attribute pattern. Both are forbidden inside an attribute.

We must ensure that the anything defined for the content of the attribute is compatible with the content of attributes as defined by the XML specification. For instance:

 anything =
   (text)
 start =
   element foo {
     attribute bar { anything },
     text
   }

which will be simplified into:

 start = __foo-elt-id2602296
 __foo-elt-id2602296 =
   element foo {
     attribute bar { text },
     text
   }

This schema expresses the original intent and it is valid.

Bad Example: attribute duplication

Let's say we want to extend the definition of our title element to have the same attributes and content model as the XHTML 2.0 span element. If we look into the RELAX NG module implementing the span element, we can see that its definition is:

  span = element span { span.attlist, Inline.model }

We want to include this in the definition of the title element, which already includes an xml:lang attribute:

 namespace x = "http://www.w3.org/2002/06/xhtml2"
 
 start = book
 include "xhtml-attribs-2.rnc" inherit = x
 include "xhtml-inltext-2.rnc" inherit = x
 include "xhtml-datatypes-2.rnc" inherit = x
 book =
   element book {
     attribute id { text },
     attribute available { text },
     element isbn { text },
     element title {
       attribute xml:lang { xsd:language },
       span.attlist,
       Inline.model
     }
   }

Unfortunately, this is invalid because an xml:lang attribute is already included in the span.attlist pattern. The two definitions get combined during simplification, which turns the definition of the title element into:

 __title-elt-id2641936 =
  element title {
    (attribute xml:lang { xsd:language },
     (((((((((empty
              | attribute id { xsd:ID }),
             (empty
              | attribute class { xsd:NMTOKENS })),
            (empty
             | attribute title { text })),
           (empty
            | attribute xml:lang { xsd:language })),
          (empty
           | attribute dir {
               (("ltr" | "rtl") | "lro")
               | "rlo"
             })),
         ((empty
           | attribute edit {
               (("inserted" | "deleted") | "changed")
               | "moved"
             }),
          (empty
           | attribute datetime { xsd:dateTime }))),
        ((((((((empty
                | attribute href { xsd:anyURI }),
               (empty
                | attribute cite { xsd:anyURI })),
              (empty
               | attribute target { xsd:NMTOKEN })),
             (empty
              | attribute rel { xsd:NMTOKENS })),
            (empty
             | attribute rev { xsd:NMTOKENS })),
           (empty
            | attribute accesskey {
                xsd:string { length = "1" }
              })),
          (empty
           | attribute navindex {
               xsd:nonNegativeInteger {
                 pattern = "0-9+"
                 minInclusive = "0"
                 maxInclusive = "32767"
               }
             })),
         (empty
          | attribute base { xsd:anyURI }))),
       ((empty
         | attribute src { xsd:anyURI }),
        (empty
         | attribute type { text }))),
      ((((empty
          | attribute usemap { xsd:anyURI }),
         (empty
          | attribute ismap { "ismap" })),
        (empty
         | attribute shape {
             (("rect" | "circle") | "poly")
             | "default"
           })),
       (empty
        | attribute coords { text })))),
    (empty
     | (empty
        | (text
           | (((((((((((((abbr-id2635861 | cite-id2635889)
                         | code-id2635918)
                        | dfn-id2635947)
                       | em-id2635975)
                      | kbd-id2636004)
                     | l-id2636032)
                    | quote-id2636061)
                   | samp-id2636090)
                  | span-id2636118)
                 | strong-id2636147)
                | sub-id2636176)
               | sup-id2636204)
              | var-id2636233)))+)
  }

To fix this, we need to remove the xml:lang attribute from our original definition, creating:

 namespace x = "http://www.w3.org/2002/06/xhtml2"
 
 start = book
 include "xhtml-attribs-2.rnc" inherit = x
 include "xhtml-inltext-2.rnc" inherit = x
 include "xhtml-datatypes-2.rnc" inherit = x
 book =
   element book {
     attribute id { text },
     attribute available { text },
     element isbn { text },
     element title {
       span.attlist,
       Inline.model
     }
   }

Bad Example: name class overlap

Let's say we have the following schema, called book.rnc:

 default namespace lib = "http://eric.van-der-vlist.com/ns/library"
 namespace local = ""
 
 start = book
        
 book =
   element book {
     attribute id { text },
     attribute available { text },
     foreign-attributes,
     element isbn { text },
     element title {
       attribute xml:lang { xsd:language },
       text
     }
   }
        
   foreign-attributes = attribute * - (local:* | lib:* ) { text }*

Although we have accepted foreign attributes, we should be more precise about the definition of some Dublin Core elements. We can extend our schema like this:

 namespace dc="http://purl.org/dc/elements/1.1/"
        
 include "book.rnc"
 
 book.content &= attribute dc:rights { text } ?

Unfortunately, this is invalid because it gets simplified as:

 book-id2604347 =
   element book {
     ((((attribute id { text },
         attribute available { text }),
        (empty
         | attribute * - (lib:* | local:*) { text }+)),
       __isbn-elt-id2604556),
      __title-elt-id2604551)
     & attribute ns1:rights { text }
   }

The attribute dc:rights is included in the name class "* - (lib:* | local:*)", so the two attribute patterns overlap. To fix this, we need to redefine the named pattern foreign-attributes to exclude the name dc:rights, or perhaps even the whole Dublin Core namespace:

 default namespace lib = "http://eric.van-der-vlist.com/ns/library"
 namespace dc="http://purl.org/dc/elements/1.1/"
 namespace local = ""
 
 include "book.rnc" {
   foreign-attributes = attribute * - (local:* | lib:* | dc:* ) { text }*
 }
 
 book.content &= attribute dc:rights { text } ?
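
The narrower of the two fixes mentioned above, excluding only the name dc:rights, could look like the following sketch. It assumes the same book.rnc and the same namespace declarations as the example above; other Dublin Core attributes remain covered by foreign-attributes:

```rnc
default namespace lib = "http://eric.van-der-vlist.com/ns/library"
namespace dc = "http://purl.org/dc/elements/1.1/"
namespace local = ""

include "book.rnc" {
  # Exclude only dc:rights from the wildcard, not the whole dc namespace.
  foreign-attributes = attribute * - (local:* | lib:* | dc:rights) { text }*
}

book.content &= attribute dc:rights { text }?
```

With this variant, each further Dublin Core attribute that gets its own definition must also be added to the exclusion list.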

Constraints on lists

Lists work on text nodes by splitting them into tokens which are then handled as text nodes. It's therefore not possible to find elements or attributes in a list. Embedding lists within lists, or mixing them with text nodes, would be confusing and is forbidden anyway:

List patterns cannot have any of these descendants: list, ref (because after simplification, access to elements is done using references to named patterns), attribute, or text. The interleave pattern is also forbidden as a descendant of list patterns because it would complicate implementations.
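
A minimal sketch of these rules, using a hypothetical sizes element that is not part of the book's examples:

```rnc
# Valid: the list contains only data patterns.
start = element sizes { list { xsd:decimal+ } }

# Invalid: text is forbidden inside a list.
#   start = element sizes { list { text } }

# Invalid: a list cannot be nested inside another list.
#   start = element sizes { list { list { xsd:decimal }+ } }
```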

Bad Example: list and interleave

Let's say we'd like to define a price element as allowing a numeric followed by a token, such as:

 <price>1 Euro</price>

or a token followed by a numeric:

 <price>USD 1</price>

We might be tempted to write:

 element price {
   list { xsd:decimal & xsd:token }
 }

But this would be invalid because interleave is forbidden in a list. To work around this limitation, we need to give all the possible combinations. It's easy with this small example, though it can rapidly grow out of control as more types are added. In this case, it just requires a bit of duplication:

 element price {
   list { (xsd:decimal, xsd:token) | (xsd:token, xsd:decimal) }
 }
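
To illustrate how quickly this grows, here is a sketch with a hypothetical third item (an xsd:date, which is not part of the original example). Three items already need six alternatives, and n items need n! of them:

```rnc
# Hypothetical extension: amount, currency and date in any order.
element price {
  list {
    (xsd:decimal, xsd:token, xsd:date)
    | (xsd:decimal, xsd:date, xsd:token)
    | (xsd:token, xsd:decimal, xsd:date)
    | (xsd:token, xsd:date, xsd:decimal)
    | (xsd:date, xsd:decimal, xsd:token)
    | (xsd:date, xsd:token, xsd:decimal)
  }
}
```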

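Constraints on interleave

A text pattern cannot appear in more than one of the patterns combined by interleave.

Bad Example: text in interleave

Let's say we have the following schema, called book.rnc:

 start = book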
 book = element book { book.content }
 book.content =
   attribute id { text },
   attribute available { text },
   element isbn { text },
   title
 title = element title { title.attributes, title.content }
 title.attributes = attribute xml:lang { xsd:language }
 title.content = text

To add the XHTML Inline.model to title.content we might be tempted to write:

 
 include "book.rnc"
 include "xhtml-attribs-2.rnc" 
 include "xhtml-inltext-2.rnc" 
 include "xhtml-datatypes-2.rnc" 
 
 title.content &= Inline.model

Unfortunately, Inline.model already contains a text pattern and gets simplified to:

 title-id2635741 =
  element title {
    attribute lang { xsd:language },
    (text
     & (empty
        | (empty
           | (text
              | (((((((((((((abbr-id2636549 | cite-id2636578)
                            | code-id2636607)
                           | dfn-id2636636)
                          | em-id2636664)
                         | kbd-id2636693)
                        | l-id2636721)
                       | quote-id2636750)
                      | samp-id2636778)
                     | span-id2636807)
                    | strong-id2636836)
                   | sub-id2636865)
                  | sup-id2636893)
                 | var-id2636922)))+))
  }

We now have text patterns on both sides of the interleave, which is forbidden. To fix this problem, we need to replace our combination with a redefinition of title.content:

 include "book.rnc" {
   title.content = Inline.model
 }
 include "xhtml-attribs-2.rnc" 
 include "xhtml-inltext-2.rnc" 
 include "xhtml-datatypes-2.rnc"  

There is no loss in expressive power (we are able to describe what we wanted to describe), but there is a loss in modularity. Changes made to title.content in "book.rnc" would now have to be manually added to our derived schema.
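
Stripped of the XHTML modules, the restriction and its fix can be reduced to the following sketch, in which the em element is a hypothetical stand-in for the inline elements of Inline.model:

```rnc
# Invalid: text appears on both sides of the interleave.
#   start = element title { text & (text | element em { text })* }

# Valid: a single text pattern, inside the repeated choice.
start = element title { (text | element em { text })* }
```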

All text is copyright Eric van der Vlist, Dyomedea. During development, I give permission for non-commercial copying for educational and review purposes. After publication, all text will be released under the Free Software Foundation GFDL.
