RELAX NG, by Eric van der Vlist, is published by O'Reilly & Associates (ISBN: 0596004214).


RELAX NG


Table of Contents

Foreword by James Clark
Foreword by Murata Makoto
Preface
Who Should Read This Book?
Who Shouldn't Read This Book?
Organization of This Book
Conventions Used in This Book
Comments and Questions
Powered by WikiML
Acknowledgments
I. Tutorial
1. What RELAX NG Offers
1.1. Diversity
1.2. Keeping Documents Independent of Applications
1.3. Validation Has Many Aspects
1.4. The Best Way to Validate XML Document Structures
1.5. RELAX NG's Diverse Applications
1.6. RELAX NG as a Pivot Format
1.7. Why Use Other Schema Languages?
2. Simple Foundations Are Beautiful
2.1. Documents and Infosets
2.2. Different Types of Schema Languages
2.3. A Simple Example
2.4. A Strong Mathematical Background
2.5. Patterns, and Only Patterns
3. First Schema
3.1. Getting Started
3.2. First Patterns
3.2.1. The text Pattern
3.2.2. The attribute Pattern
3.2.3. The element Pattern
3.2.4. The optional Pattern
3.2.5. The oneOrMore Pattern
3.2.6. The zeroOrMore Pattern
3.3. Complete Schema
3.3.1. Constraining Number of Occurrences
3.3.2. Creating "Russian Doll" Schemas
4. Introducing the Compact Syntax
4.1. First Compact Patterns
4.1.1. The text Pattern
4.1.2. The attribute Pattern
4.1.3. Element
4.1.4. The optional Pattern
4.1.5. The oneOrMore Pattern
4.1.6. The zeroOrMore Pattern
4.2. Full Schema
4.3. XML or Compact?
5. Flattening the First Schema
5.1. Defining Named Patterns
5.2. Referencing Named Patterns
5.3. The grammar and start Elements
5.4. Assembling the Parts
5.5. Problems That Never Arise
5.6. Recursive Models
5.7. Escaping Named Pattern Identifiers in the Compact Syntax
6. More Complex Patterns
6.1. The group Pattern
6.2. The interleave Pattern
6.3. The choice Pattern
6.4. Pattern Compositions
6.5. Order Variation as a Source of Information
6.6. Text and Empty Patterns, Whitespace, and Mixed Content
6.7. Why Is It Called interleave?
6.8. Mixed Content Models with Order
6.9. A Restriction Related to interleave
6.10. A Missing Pattern: Unordered Group
7. Constraining Text Values
7.1. Fixed Values
7.2. Co-Occurrence Constraints
7.3. Enumerations
7.4. Whitespace and RELAX NG Native Datatypes
7.5. Using String Datatypes in Attribute Values
7.6. When to Use String Datatypes
7.7. Using Different Types in Each Value
7.8. Exclusions
7.9. Lists
7.10. Data Versus Text
8. Datatype Libraries
8.1. W3C XML Schema Type Library
8.1.1. The Datatypes
8.1.2. Facets
8.2. DTD Compatibility Datatypes
8.3. Which Library Should Be Used?
8.3.1. Native Types Versus W3C XML Schema Datatypes
8.3.2. DTD Versus W3C XML Schema Datatypes
9. Using Regular Expressions to Specify Simple Datatypes
9.1. A Swiss Army Knife
9.2. The Simplest Possible Pattern Facets
9.3. Quantifying
9.4. More Atoms
9.4.1. Special Characters
9.4.2. Wildcard
9.4.3. Character Classes
9.4.4. Or-ing and Grouping
9.5. Common Patterns
9.5.1. String Datatypes
9.5.2. Numeric and Float Types
9.5.3. Datetimes
10. Creating Building Blocks
10.1. Using External References
10.1.1. With Russian Doll Schemas
10.1.2. With Flat Schemas
10.1.3. Embedding Grammars
10.1.4. Referencing Patterns in Parent Grammars
10.2. Merging Grammars
10.2.1. Merging Without Redefinition
10.2.2. Merging and Replacing Definitions
10.2.3. Combining Definitions
10.2.4. Why Can't Definitions Be Defined by Group?
10.3. A Real-World Example: XHTML 2.0
10.4. Other Options
10.4.1. A Possible Use Case
10.4.2. XML Tools
10.4.3. Text Tools
11. Namespaces
11.1. A Ten-Minute Guide to XML Namespaces
11.2. The Two Challenges of Namespaces
11.3. Declaring Namespaces in Schemas
11.3.1. Using the Default Namespace
11.3.2. Using Prefixes
11.4. Accepting Foreign Namespaces
11.4.1. Constructing a Wildcard
11.4.2. Using Wildcards
11.4.3. Where Should Foreign Nodes Be Allowed?
11.4.4. Traps to Avoid
11.4.5. Adding Foreign Nodes Through Combination
11.5. Namespaces, Building Blocks, and Chameleon Design
11.5.1. Reexamining XHTML 2.0
11.5.2. Putting a Chameleon in the Library
11.5.3. Good Chameleon or Evil Chameleon?
12. Writing Extensible Schemas
12.1. Extensible Schemas
12.1.1. Working from a Fixed Result
12.1.2. Free Formats
12.1.3. Restricting Existing Schemas
12.2. The Case for Open Schemas
12.2.1. More Name Classes
12.3. Extensible and Open?
13. Annotating Schemas
13.1. Common Principles for Annotating RELAX NG Schemas
13.1.1. Annotation Using the XML Syntax
13.1.2. Annotations Using the Compact Syntax
13.1.3. Annotating Groups of Definitions
13.1.4. Alternatives and Workarounds
13.2. Documentation
13.2.1. Comments
13.2.2. RELAX NG DTD Compatibility Comments
13.2.3. XHTML Annotations
13.2.4. DocBook Annotations
13.2.5. Dublin Core Annotations
13.2.6. SVG Annotations
13.2.7. RDDL Annotations
13.3. Annotation for Applications
13.3.1. Annotations for Preprocessing
13.3.2. Annotations for Conversion
13.3.3. Annotations for Extension
14. Generating RELAX NG Schemas
14.1. Examplotron: Instance Documents as Schemas
14.1.1. Ten-Minute Guide to Examplotron
14.1.2. Use Cases
14.2. Literate Programming
14.2.1. Out of the Box
14.2.2. Adding Bells and Whistles for RDDL
14.3. UML
14.4. Spreadsheets
15. Simplification and Restrictions
15.1. Simplification
15.1.1. Annotation Removal, Whitespace and Attribute Normalization, and Inheritance
15.1.2. Retrieval of External Schemas
15.1.3. Name Class Normalization
15.1.4. Pattern Normalization
15.1.5. First Set of Constraints
15.1.6. Grammar Merge
15.1.7. Schema Flattening
15.1.8. Final Cleanup
15.2. Restrictions
15.2.1. Constraints on Attributes
15.2.2. Constraints on Lists
15.2.3. Constraints on Except Patterns
15.2.4. Constraints on Start Patterns
15.2.5. Constraints on Content Models
15.2.6. Limitations on interleave
16. Determinism and Datatype Assignment
16.1. What Is Ambiguity?
16.1.1. Ambiguity Versus Determinism
16.1.2. Different Kinds of Ambiguity
16.2. The Downsides of Ambiguous and Nondeterministic Content Models
16.2.1. Instance Annotations
16.2.2. Compatibility with W3C XML Schema
16.3. Some Ideas to Make Disambiguation Easier
16.3.1. Generalizing the Except Pattern
16.3.2. Making Disambiguation Rules Explicit
16.3.3. Accepting Ambiguity
II. Reference
17. Element Reference
17.1. Elements
anyName - Name class accepting any name
attribute - Pattern matching an attribute
choice (in the context of a name-class) - Choice between name classes
choice (in the context of a pattern) - choice pattern
data - data pattern
define - Named pattern definition
div (in the context of a grammar-content) - Division (in the context of a grammar)
div (in the context of an include-content) - Division (in the context of an include)
element - Pattern matching an element
empty - Empty content
except (in the context of an except-name-class) - Remove a name class from another
except (in the context of a pattern) - Remove a set of values from a data pattern
externalRef - Reference to an external schema
grammar - Grammar pattern
group - group pattern
include - Grammar merge
interleave - interleave pattern
list - Text node split
mixed - Pattern for mixed content models
name - Name class for a single name
notAllowed - Not allowed
nsName - Name class for any name in a namespace
oneOrMore - oneOrMore pattern
optional - optional pattern
param - Datatype parameter
parentRef - Reference to a named pattern from the parent grammar
ref - Reference to a named pattern
start - Start of a grammar
text - Pattern matching text nodes
value - Match a text node and a value
zeroOrMore - zeroOrMore pattern
18. Compact Syntax Reference
18.1. EBNF Production Reference
"""...""" - Literal segment enclosed in three double quotes
"..." - Literal segment enclosed in double quotes
'''...''' - Literal segment enclosed in three single quotes
'...' - Literal segment enclosed in single quotes
(nameClass) - Container
(pattern) - Container
*-nameClass - Name class accepting any name
-nameClass - Remove a name class from another
-pattern - Remove a set of values from a data pattern
CName - Colonized names
QuotedIdentifier - Quoted identifier
Top level - Top level
assignMethod - Defines how to assign content to start and named patterns
attribute - Pattern matching an attribute
datatypeName - Datatype name
datatypeName literal - Matches a text node and a value
datatypeName param exceptPattern - data pattern
datatypes - Namespace declaration (to identify datatype libraries)
decl - Declarations
default namespace - Default namespace declaration
div - Division (in the context of a grammar)
element - Pattern matching an element
empty - Empty content
external - Reference to an external schema
grammar - Grammar pattern
grammarContent - Content of a grammar
identifier - Identifier
identifier assignMethod pattern - Named pattern definition
identifierOrKeyword - Identifier or keyword
include - Grammar merge
includeContent - Content of an include pattern
inherit - Namespace inheritance
keyword - Keywords
list - Text node split
literal - Literal
literalSegment - Literal segment
mixed - Pattern for mixed content models
name - Define a set of names that must be matched by an element or attribute
nameClass - Define a set of names that must be matched by an element or attribute
nameClass|nameClass - Choice between name classes
namespace - Namespace declaration
namespaceURILiteral - Namespace URI Literal
notAllowed - Not allowed
nsName exceptNameClass - Name class for any name in a namespace
param - Datatype parameter
parent - Reference to a named pattern from the parent grammar
pattern - Pattern
pattern&pattern - interleave pattern
pattern* - zeroOrMore pattern
pattern+ - oneOrMore pattern
pattern,pattern - group pattern
pattern? - optional pattern
pattern|pattern - choice pattern
start - Start of a grammar
text - Pattern matching text nodes
19. Datatype Reference
xsd:anyURI - URI (Uniform Resource Identifier)
xsd:base64Binary - Binary content coded as "base64"
xsd:boolean - Boolean (true or false)
xsd:byte - Signed value of 8 bits
xsd:date - Gregorian calendar date
xsd:dateTime - Instant of time (Gregorian calendar)
xsd:decimal - Decimal numbers
xsd:double - IEEE 64-bit floating-point
xsd:duration - Time durations
xsd:ENTITIES - Whitespace-separated list of unparsed entity references
xsd:ENTITY - Reference to an unparsed entity
xsd:float - IEEE 32-bit floating-point
xsd:gDay - Recurring period of time: monthly day
xsd:gMonth - Recurring period of time: yearly month
xsd:gMonthDay - Recurring period of time: yearly day
xsd:gYear - Period of one year
xsd:gYearMonth - Period of one month
xsd:hexBinary - Binary contents coded in hexadecimal
xsd:ID - Definition of unique identifiers
xsd:IDREF - Definition of references to unique identifiers
xsd:IDREFS - Definition of lists of references to unique identifiers
xsd:int - 32-bit signed integers
xsd:integer - Signed integers of arbitrary length
xsd:language - RFC 1766 language codes
xsd:long - 64-bit signed integers
xsd:Name - XML 1.0 name
xsd:NCName - Unqualified names
xsd:negativeInteger - Strictly negative integers of arbitrary length
xsd:NMTOKEN - XML 1.0 name token (NMTOKEN)
xsd:NMTOKENS - List of XML 1.0 name tokens (NMTOKEN)
xsd:nonNegativeInteger - Integers of arbitrary length positive or equal to zero
xsd:nonPositiveInteger - Integers of arbitrary length negative or equal to zero
xsd:normalizedString - Whitespace-replaced strings
xsd:NOTATION - Emulation of the XML 1.0 feature
xsd:positiveInteger - Strictly positive integers of arbitrary length
xsd:QName - Qualified names as defined by Namespaces in XML
xsd:short - 16-bit signed integers
xsd:string - Any string
xsd:time - Point in time recurring each day
xsd:token - Whitespace-replaced and collapsed strings
xsd:unsignedByte - Unsigned value of 8 bits
xsd:unsignedInt - Unsigned integer of 32 bits
xsd:unsignedLong - Unsigned integer of 64 bits
xsd:unsignedShort - Unsigned integer of 16 bits
III. Appendixes
A. DSDL
A.1. A Multipart Standard
A.1.1. Part 1: Overview
A.1.2. Part 2: Regular Grammar-Based Validation
A.1.3. Part 3: Rule-Based Validation
A.1.4. Part 4: Selection of Validation Candidates
A.1.5. Part 5: Datatypes
A.1.6. Part 6: Path-Based Integrity Constraints
A.1.7. Part 7: Character Repertoire Validation
A.1.8. Part 8: Declarative Document Architectures
A.1.9. Part 9: Namespace- and Datatype-Aware DTDs
A.1.10. Part 10: Validation Management
A.2. What DSDL Should Bring You
B. The GNU Free Documentation License
GNU Free Documentation License
0. Preamble
1. APPLICABILITY AND DEFINITIONS
2. VERBATIM COPYING
3. COPYING IN QUANTITY
4. MODIFICATIONS
5. COMBINING DOCUMENTS
6. COLLECTIONS OF DOCUMENTS
7. AGGREGATION WITH INDEPENDENT WORKS
8. TRANSLATION
9. TERMINATION
10. FUTURE REVISIONS OF THIS LICENSE
Addendum: How to use this License for your documents
Glossary

List of Figures

2-1. A complete example of the book element
2-2. The blocks of the book element, seen from a W3C XML Schema perspective
2-3. An alternate approach to the document structure, made possible with RELAX NG
3-1. Comparing the Russian doll schema structure with that of the instance document
4-1. Comparing the RELAX NG XML syntax with its smaller compact syntax counterpart
5-1. Two different definitions of name in the same schema
5-2. Groups of identical attributes on different element types
5-3. Bizarre combinations of child content for a group
11-1. A mix of elements in different namespaces
12-1. A flat schema, which is difficult to extend
12-2. A split schema, which is easier to extend
12-3. An instance document with interleaved content
12-4. A document with interleaved container
14-1. The XHTML documentation for the RELAX NG XML syntax
14-2. The XHTML documentation for the RELAX NG compact syntax
14-3. A RDDL document produced using literate programming
14-4. Overlaps between XML, UML, and W3C XML Schema
14-5. Overlaps between UML and RELAX NG
14-6. A UML model for the library
14-7. The library document structure, described in a spreadsheet

List of Tables

9-1. Special characters
9-2. Unicode character classes
9-3. Unicode character blocks

List of Examples

3-1. Sample instance document
4-1. Compact syntax of full RELAX NG schema

ISBN: 0-596-00421-4
Edition: First
Author: Eric van der Vlist
Pages: 506
Publication date: December 2003
Publisher: O'Reilly
Related titles: Java & XML, 2nd Edition; XSLT Cookbook; Learning XML, 2nd Edition; XML in a Nutshell, 2nd Edition

RELAX NG is a grammar-based schema language that is both easy to learn for schema creators and easy to implement for software developers. In RELAX NG, developers are introduced to this language and learn a no-nonsense method for creating XML schemas. The book offers a clear-cut explanation of RELAX NG that enables intermediate and advanced XML developers to focus on XML document structures and content rather than battle the intricacies of yet another convoluted standard.

This text is released under the Free Software Foundation GFDL.