by Eric van der Vlist is published by O'Reilly & Associates (ISBN: 0596004214)


Quantifying

Despite surface similarities, the pattern facet interprets its value very differently from the way value does. value reads its content as a lexical representation and converts it to the corresponding value of its base datatype, while the pattern facet reads its content as a set of conditions to apply to lexical values. When you write:

 pattern="15"

you specify three conditions: the first character equals 1, the second character equals 5, and the string must end after the 5. Each matching condition (such as "first character equals 1" or "second character equals 5") is called a piece. This is the simplest way to specify pieces.
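The three conditions can be checked by hand with a small Python sketch. It assumes that re.fullmatch is an acceptable stand-in for the implicit anchoring of a pattern facet; Python's re module is not the W3C XML Schema regular expression dialect, and the helper name matches_pattern is hypothetical, but for a literal pattern such as 15 the two should behave alike.

```python
import re

def matches_pattern(pattern, lexical):
    """Return True if the whole lexical value matches the pattern.

    Assumption: re.fullmatch imitates the implicit anchoring of a
    pattern facet; this is not a RELAX NG validator.
    """
    return re.fullmatch(pattern, lexical) is not None

# "15" satisfies all three conditions.
print(matches_pattern("15", "15"))
# "150" fails the third condition: the string does not end after the 5.
print(matches_pattern("15", "150"))
# "015" fails the first condition: the first character is not 1.
print(matches_pattern("15", "015"))
```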

Each piece in a pattern facet is composed of an atom, which identifies a character or a set of characters, and an optional quantifier. Characters (except special characters, which must be escaped) are the simplest form of atoms. In the example, I have omitted the quantifiers. Quantifiers may be written in two different syntaxes: either special characters (* for zero or more, + for one or more, and ? for zero or one) or a numeric range within curly braces ({n} for exactly n times, {n,m} for between n and m times, or {n,} for n or more times).
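To get a feel for the six quantifier forms, the following Python sketch lists how many repetitions of a single atom each one accepts. The atom a, the helper name accepted_lengths, and the upper bound of four repetitions are all illustrative assumptions. Python's re module happens to use the same quantifier notation, but it is a different regular expression dialect from the one used by the pattern facet.

```python
import re

def accepted_lengths(pattern, max_len=4):
    """List every n in 0..max_len for which 'a' * n matches the whole pattern."""
    return [n for n in range(max_len + 1)
            if re.fullmatch(pattern, "a" * n) is not None]

# One pattern per quantifier form: *, +, ?, {n}, {n,m}, {n,}
for pattern in ["a*", "a+", "a?", "a{2}", "a{1,3}", "a{2,}"]:
    print(pattern, accepted_lengths(pattern))
```

Lengths above four are cut off by the max_len default, so open-ended quantifiers such as * and {2,} appear bounded in the output even though they are not.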

Using these quantifiers, you can merge the three pattern facets into one:

<data type="byte">
  <param name="pattern">1?5?</param>
</data>

or:

xsd:byte {pattern = "1?5?"}

This new pattern facet means that the value must consist of zero or one occurrence of the character 1, followed by zero or one occurrence of the character 5. This is not exactly the same meaning as the three previous pattern facets, because the pattern facet now also accepts the empty string "". However, because the empty string doesn't belong to the lexical space of the base type (xsd:byte), the new datatype has the same lexical space as the previous one.
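The interplay between the pattern and the base type can be sketched in Python. This is only an approximation under stated assumptions: is_byte_lexical is a hypothetical helper that treats the lexical space of xsd:byte as an optional sign followed by decimal digits with a value between -128 and 127, it ignores whitespace handling, and re.fullmatch stands in for the pattern facet.

```python
import re

def is_byte_lexical(text):
    """Rough approximation of the xsd:byte lexical space (assumption)."""
    if re.fullmatch(r"[+-]?[0-9]+", text) is None:
        return False
    return -128 <= int(text) <= 127

def matches(pattern, text):
    """True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

candidates = ["", "1", "5", "15", "51"]

# Strings accepted by the pattern alone: the empty string is included.
pattern_only = [c for c in candidates if matches("1?5?", c)]
print(pattern_only)

# Strings accepted by both the pattern and the base type: the empty
# string is rejected because it is not a valid byte.
restricted = [c for c in pattern_only if is_byte_lexical(c)]
print(restricted)
```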

You can also use quantifiers to limit the number of leading zeros. For instance, the following pattern facet allows at most two leading zeros:

<data type="byte">
  <param name="pattern">0{0,2}1?5?</param>
</data>

or:

xsd:byte {pattern = "0{0,2}1?5?"}
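The effect of the {0,2} quantifier can be tried out with a short Python sketch. As before, re.fullmatch is assumed to be a reasonable stand-in for the pattern facet on such a simple pattern, and the names LEADING_ZEROS and allowed are hypothetical.

```python
import re

# The pattern from the example above.
LEADING_ZEROS = "0{0,2}1?5?"

def allowed(text):
    """True if the whole text matches the leading-zeros pattern."""
    return re.fullmatch(LEADING_ZEROS, text) is not None

# Zero, one, or two leading zeros pass; three do not.
for sample in ["15", "015", "0015", "00015", "0"]:
    print(repr(sample), allowed(sample))
```

Because every piece of this pattern is optional, a value made only of zeros, such as 0, also matches it.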

This text is released under the Free Software Foundation GFDL.