by Eric van der Vlist is published by O'Reilly & Associates (ISBN: 0596004214)
RELAX NG is an XML-based technology. RELAX NG schemas are commonly stored in XML documents (called schema documents) and used to validate other XML documents (called instance documents). Although RELAX NG works with and uses XML documents, RELAX NG processors do not process the actual text of the XML document, an approach called lexical processing. Instead, they operate at a slightly higher level of abstraction, called an infoset.
An infoset is a logical view of the XML document, rather than the document as stored in a text file. Most XML processors read (or generate) XML syntax but work internally on a representation that omits a lot of details. To take a brief example, from a lexical perspective, which looks at the actual contents of an XML document, <book id='b0836217462' available="true"/> is an empty-element tag containing two attributes named id and available. The value of id is delimited with single quotes, while the value of available is delimited with double quotes. From an infoset perspective, however, this isn't an empty-element tag with a particular syntax, and the kind of quotation marks doesn't matter. It's a book element with an attribute named id whose value is b0836217462, and an attribute named available whose value is true. Elements, attributes, and text are often referred to as nodes in this perspective, like nodes in an object tree.
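The difference between the two perspectives can be sketched with Python's standard `xml.etree.ElementTree` module. This is only an illustration: the helper `infoset_view` is a hypothetical name, and the tuple it returns is a rough stand-in for an infoset, not the model defined by any specification. The second input is an assumed lexical variant of the book element, with the quote styles swapped and a start-tag/end-tag pair instead of an empty-element tag.

```python
import xml.etree.ElementTree as ET


def infoset_view(xml_text):
    """Reduce a one-element document to (name, attributes, text).

    Hypothetical helper: a rough stand-in for the infoset of the element.
    """
    element = ET.fromstring(xml_text)
    return element.tag, dict(element.attrib), element.text or ""


# The element from the text, and an assumed lexical variant of it.
lexical_a = """<book id='b0836217462' available="true"/>"""
lexical_b = """<book id="b0836217462" available='true'></book>"""

view_a = infoset_view(lexical_a)
view_b = infoset_view(lexical_b)

print(view_a)
print("same logical view:", view_a == view_b)
```

Both strings differ character by character, yet the parser hands the application the same element name, the same attribute names, and the same attribute values.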
There are a variety of different models for XML documents: specifications such as the Simple API for XML (SAX), the Document Object Model (DOM), and XPath all have slightly different takes on what an infoset is. As a first step toward coordinating these perspectives, the W3C created a Recommendation, the XML Information Set (Infoset), which is available at http://www.w3.org/TR/xml-infoset/. The XML Infoset defines an abstract model of XML documents that uses a hierarchical structure, described in terms generic and neutral enough to be acceptable for use with a diverse range of specifications.
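As a small illustration of how models can differ, the sketch below parses the same text with two Python standard-library APIs: `xml.dom.minidom`, which follows the DOM, and `xml.etree.ElementTree`, which uses its own lighter tree model. The sample document, including its comment and title, is made up for this example.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Hypothetical sample document containing a comment node.
SOURCE = '<book id="b0836217462"><!-- checked out --><title>Example</title></book>'

# DOM view: the comment is a child node of the book element.
dom_root = xml.dom.minidom.parseString(SOURCE).documentElement
dom_children = [node.nodeName for node in dom_root.childNodes]

# ElementTree view: the default parser discards comments.
et_root = ET.fromstring(SOURCE)
et_children = [child.tag for child in et_root]

print("DOM children:        ", dom_children)
print("ElementTree children:", et_children)
```

The DOM tree reports two children of book, a comment and a title element, while the default ElementTree parser reports only the title element. Neither view is wrong; each API simply exposes a slightly different logical model of the same text.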
Schema languages work at the level of the XML Infoset, and their main goal is to define constraints on a subset of the XML Infoset. Because they work at this level, they can't be used to express constraints on things that don't belong to the XML Infoset. Thus, things such as the order of the attributes, their quotation style, or the number of spaces between them can't be constrained by schemas. In addition, RELAX NG, like most schema languages, won't let you define constraints on XML comments, processing instructions, or entity references. Schema languages focus on a core set of features: elements, attributes, and textual content.
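To make the list of unconstrainable details concrete, here is a minimal sketch that again uses `xml.etree.ElementTree` as a stand-in for the element, attribute, and text core that schemas focus on. It is not a RELAX NG processor, and the `core_view` helper, the sample documents, and the `render` processing instruction are all invented for the illustration.

```python
import xml.etree.ElementTree as ET


def core_view(element):
    """Keep only element names, attributes, text, and child elements.

    Hypothetical helper approximating the features a schema can constrain.
    """
    return (
        element.tag,
        sorted(element.attrib.items()),  # attribute order is not significant
        (element.text or "").strip(),
        [core_view(child) for child in element],
    )


# Two hypothetical documents that differ only in details outside that core:
# attribute order, quoting, spacing, comments, a processing instruction,
# and the way the ampersand is written.
doc_a = '<book id="b0836217462" available="true"><title>A &amp; B</title></book>'
doc_b = """<!-- a comment before the root -->
<book   available='true'
        id='b0836217462'><?render fast?><!-- inline --><title>A &#38; B</title></book>"""

core_a = core_view(ET.fromstring(doc_a))
core_b = core_view(ET.fromstring(doc_b))

print(core_a)
print("identical core:", core_a == core_b)
```

Because the two documents reduce to the same core, a schema that accepts one necessarily accepts the other: nothing in that core records where the comments were, how the attributes were quoted or ordered, or which reference produced the ampersand.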
This text is released under the Free Software Foundation GFDL.