Chapter 5: Flattening Our First Schema

RELAX NG by Eric van der Vlist will be published by O'Reilly & Associates (ISBN: 0596004214)

If we look at the structure of our first schema, the Russian doll style schema, we see that it follows the structure of the instance document it applies to, as was shown in Figure 3-1. Writing our first schema has pretty much been limited to inserting text, element or attribute elements into the schema each time we've encountered a text node, element or attribute in the instance document. This method of creating our schema could be seen as a serialization of the XML infoset (i.e. of the structure available in the document) and could, therefore, be easily automated.

	Note
	Automated serialization is the principle behind Examplotron, a program which will be described in Chapter 14: Generating RELAX NG schemas.
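To make the node-by-node mapping described above concrete, here is a minimal sketch. The instance fragment in the comment is hypothetical and is not taken from the book's examples; the schema below it is what the "one pattern per node" approach would produce for that fragment.

```xml
<!-- Hypothetical instance fragment that this schema mirrors:
       <author id="a1"><name>Some Author</name></author>
-->
<element xmlns="http://relaxng.org/ns/structure/1.0" name="author">
  <!-- the id attribute in the instance becomes an attribute pattern -->
  <attribute name="id"/>
  <!-- the name child element becomes a nested element pattern -->
  <element name="name">
    <!-- its text node becomes a text pattern -->
    <text/>
  </element>
</element>
```

Each pattern sits exactly where the corresponding node sits in the instance, which is why the schema's nesting follows the document's nesting.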

There are a couple of drawbacks to modeling documents with the Russian doll style of schemas, however. First, they are not modular and, therefore, they become difficult to read and maintain when documents are large or complex. Second, they cannot represent recursive (self-referencing) models. Lists that may themselves contain lists are a common case of this.

The lack of modularity can be seen even in a schema as simple as our first one, which was shown in Example 3-1. We have a name element which uses the same model within both the character and author elements.

Figure 1 shows how, in our first schema, we needed to give the definition of name in each context:

Figure 1. Two different definitions of name in the same schema.

You might think that the extra text we added won't make a difference, but that's not completely true. The additional verbosity is innocuous here because the definition of the name element is simple and, therefore, short. The principle is the same when the definition is complex, however: it has to be repeated in full wherever it is used. This redundancy makes maintenance of the schema more error-prone. If I need to update the definition of the name element, I'll need to update it as many times as it appears, and each repetition is another opportunity for a mistake. Common sense applies the same rules to XML schema languages as to any programming language. Limiting repetitive work makes developers happy!
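The duplication discussed here can be sketched with an abbreviated fragment. This is not a reproduction of Example 3-1: the wrapping book element and the zeroOrMore patterns are assumptions made for illustration, and all content other than name has been left out.

```xml
<element xmlns="http://relaxng.org/ns/structure/1.0" name="book">
  <zeroOrMore>
    <element name="author">
      <!-- first definition of name -->
      <element name="name"><text/></element>
    </element>
  </zeroOrMore>
  <zeroOrMore>
    <element name="character">
      <!-- second, identical definition of name -->
      <element name="name"><text/></element>
    </element>
  </zeroOrMore>
</element>
```

If name later needed, say, an additional attribute, both copies would have to be edited, and nothing in the schema would flag a copy that had been forgotten.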

Another rule borrowed from programming languages concerns recursive models. Recursive models, that is, models that reference themselves, are those in which, for example, div elements can be embedded within other div elements with no restriction on the depth of nesting. We could just copy the definition of the div element again and again, but that is both inefficient and limiting: the schema grows with every level, and it still accepts only as many levels as we have copied. We need a way to define and reference the content model of the div element recursively. In the course of this chapter we will examine cases of both modularity and recursive models.
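A rough sketch of the "copy the definition again and again" approach shows why it is limiting. The content model used here (a title followed by any number of nested div elements) is a hypothetical one chosen for illustration, not a model taken from the book.

```xml
<element xmlns="http://relaxng.org/ns/structure/1.0" name="div">
  <element name="title"><text/></element>
  <zeroOrMore>
    <!-- level 2: a copy of the div definition -->
    <element name="div">
      <element name="title"><text/></element>
      <zeroOrMore>
        <!-- level 3: another copy; nothing deeper is allowed -->
        <element name="div">
          <element name="title"><text/></element>
        </element>
      </zeroOrMore>
    </element>
  </zeroOrMore>
</element>
```

Under these assumptions, a document with div elements nested four levels deep would be rejected, and supporting it would mean pasting in yet another copy of the definition.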

All text is copyright Eric van der Vlist, Dyomedea. During development, I give permission for non-commercial copying for educational and review purposes. After publication, all text will be released under the Free Software Foundation GFDL.
