RELAX NG Specification
Committee Specification 3 December 2001
This version:
Committee Specification: 3 December 2001
Previous versions:
Committee Specification: 11 August 2001
Editors:
James Clark, MURATA Makoto
Copyright © The Organization for the Advancement of
Structured Information Standards [OASIS] 2001. All Rights
Reserved.
This document and translations of it may be copied and furnished
to others, and derivative works that comment on or otherwise explain
it or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any kind,
provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to OASIS, except as needed for the
purpose of developing OASIS specifications, in which case the
procedures for copyrights defined in the OASIS Intellectual Property
Rights document must be followed, or as required to translate it into
languages other than English.
The limited permissions granted above are perpetual and will not
be revoked by OASIS or its successors or assigns.
This document and the information contained herein is provided
on an "AS IS" basis and OASIS DISCLAIMS ALL WARRANTIES,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE
USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY
IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR
PURPOSE.
Abstract
This is the definitive specification of RELAX NG, a simple
schema language for XML, based on [RELAX] and [TREX]. A RELAX NG
schema specifies a pattern for the structure and content of an XML
document. A RELAX NG schema is itself an XML document.
Status of this Document
This Committee Specification was approved for publication by the
OASIS RELAX NG technical committee. It is a stable document which
represents the consensus of the committee. Comments on this document
may be sent to
[email protected]
A list of known errors in this document is available at
Table of Contents
1 Introduction
2 Data model
    2.1 Example
3 Full syntax
    3.1 Example
4 Simplification
    4.1 Annotations
    4.2 Whitespace
    4.3 datatypeLibrary attribute
    4.4 type attribute of value element
    4.5 href attribute
    4.6 externalRef element
    4.7 include element
    4.8 name attribute of element and attribute elements
    4.9 ns attribute
    4.10 QNames
    4.11 div element
    4.12 Number of child elements
    4.13 mixed element
    4.14 optional element
    4.15 zeroOrMore element
    4.16 Constraints
    4.17 combine attribute
    4.18 grammar element
    4.19 define and ref elements
    4.20 notAllowed element
    4.21 empty element
5 Simple syntax
    5.1 Example
6 Semantics
    6.1 Name classes
    6.2 Patterns
        6.2.1 choice pattern
        6.2.2 group pattern
        6.2.3 empty pattern
        6.2.4 text pattern
        6.2.5 oneOrMore pattern
        6.2.6 interleave pattern
        6.2.7 element and attribute pattern
        6.2.8 data and value pattern
        6.2.9 Built-in datatype library
        6.2.10 list pattern
    6.3 Validity
    6.4 Example
7 Restrictions
    7.1 Contextual restrictions
        7.1.1 attribute pattern
        7.1.2 oneOrMore pattern
        7.1.3 list pattern
        7.1.4 except in data pattern
        7.1.5 start element
    7.2 String sequences
    7.3 Restrictions on attributes
    7.4 Restrictions on interleave
8 Conformance
Appendixes
    RELAX NG schema for RELAX NG
    Changes since version 0.9
    RELAX NG TC (Non-Normative)
References
1. Introduction
This document specifies
  - when an XML document is a correct RELAX NG schema
  - when an XML document is valid with respect to a correct RELAX NG
    schema
An XML document that is being validated with respect to a RELAX NG
schema is referred to as an instance.
The structure of this document is as follows. Section 2 describes the
data model, which is the abstraction of an XML document used
throughout the rest of the document. Section 3 describes the syntax of
a RELAX NG schema; any correct RELAX NG schema must conform to this
syntax. Section 4 describes a sequence of transformations that are
applied to simplify a RELAX NG schema; applying the transformations
also involves checking certain restrictions that must be satisfied by
a correct RELAX NG schema. Section 5 describes the syntax that results
from applying the transformations; this simple syntax is a subset of
the full syntax. Section 6 describes the semantics of a correct RELAX
NG schema that uses the simple syntax; the semantics specify when an
element is valid with respect to a RELAX NG schema. Section 7
describes restrictions in terms of the simple syntax; a correct RELAX
NG schema must be such that, after transformation into the simple
form, it satisfies these restrictions. Finally, Section 8 describes
conformance requirements for RELAX NG validators.
A tutorial is available separately (see [Tutorial]).
2. Data model
RELAX NG deals with XML documents representing both schemas and
instances through an abstract data model. XML documents representing
schemas and instances must be well-formed in conformance with
[XML 1.0] and must conform to the constraints of [XML Namespaces].
An XML document is represented by an element. An element consists
of
  - a name
  - a context
  - a set of attributes
  - an ordered sequence of zero or more children; each child is either
    an element or a non-empty string; the sequence never contains two
    consecutive strings
A name consists of
  - a string representing the namespace URI; the empty string has
    special significance, representing the absence of any namespace
  - a string representing the local name; this string matches the
    NCName production of [XML Namespaces]
A context consists of
  - a base URI
  - a namespace map; this maps prefixes to namespace URIs, and also
    may specify a default namespace URI (as declared by the xmlns
    attribute)
An attribute consists of
  - a name
  - a string representing the value
A string consists of a sequence of zero or more characters,
where a character is as defined in [XML 1.0].
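The data model above can be sketched as a handful of Python classes. This is only an illustration, not part of the specification: the class names, the field names and the check_children helper are all hypothetical, and representing the set of attributes as a dictionary keyed by name is a design choice made here for convenience.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass(frozen=True)
class Name:
    namespace_uri: str  # "" represents the absence of any namespace
    local_name: str     # expected to match the NCName production


@dataclass
class Context:
    base_uri: str
    namespace_map: Dict[str, str] = field(default_factory=dict)  # prefix -> URI
    default_namespace: str = ""


@dataclass
class Element:
    name: Name
    context: Context
    attributes: Dict[Name, str] = field(default_factory=dict)
    children: List[Union["Element", str]] = field(default_factory=list)

    def check_children(self) -> None:
        """Raise ValueError if the children break the data-model rules."""
        previous_was_string = False
        for child in self.children:
            if isinstance(child, str):
                if child == "":
                    raise ValueError("string children must be non-empty")
                if previous_was_string:
                    raise ValueError("two consecutive string children")
                previous_was_string = True
            else:
                previous_was_string = False
```

Keying attributes by Name mirrors the fact that an element cannot carry two attributes with the same name, so a mapping loses no information compared with a set of (name, value) pairs.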
The element for an XML document is constructed from an instance
of the [XML Infoset] as follows. We use the notation [x] to refer to
the value of the x property of an information item. An
element is constructed from a document information item by
constructing an element from the [document element]. An element is
constructed from an element information item by constructing the name
from the [namespace name] and [local name], the context from the [base
URI] and [in-scope namespaces], the attributes from the [attributes],
and the children from the [children]. The attributes of an element
are constructed from the unordered set of attribute information items
by constructing an attribute for each attribute information item. The
children of an element are constructed from the list of child
information items first by removing information items other than
element information items and character information items, and then by
constructing an element for each element information item in the list
and a string for each maximal sequence of character information items.
An attribute is constructed from an attribute information item by
constructing the name from the [namespace name] and [local name], and
the value from the [normalized value]. When constructing the name of
an element or attribute from the [namespace name] and [local name], if
the [namespace name] property is not present, then the name is
constructed from an empty string and the [local name]. A string is
constructed from a sequence of character information items by
constructing a character from the [character code] of each character
information item.
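As a rough illustration of this construction, the following sketch uses Python's standard xml.dom.minidom as a stand-in for an infoset. It is an approximation under stated assumptions: the function name build and the tuple layout are hypothetical, the context (base URI and namespace map) is left out for brevity, and namespace declarations are skipped because minidom exposes them as ordinary attributes.

```python
from xml.dom import Node, minidom

# minidom reports xmlns declarations as attributes in this namespace.
XMLNS_NS = "http://www.w3.org/2000/xmlns/"


def build(node):
    """Return (namespace_uri, local_name, attributes, children) for an element."""
    attributes = {}
    for i in range(node.attributes.length):
        attr = node.attributes.item(i)
        if attr.namespaceURI == XMLNS_NS:
            continue  # a namespace declaration, not an attribute
        # A missing namespace name becomes the empty string.
        attributes[(attr.namespaceURI or "", attr.localName)] = attr.value

    children = []
    for child in node.childNodes:
        if child.nodeType == Node.ELEMENT_NODE:
            children.append(build(child))
        elif child.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
            if not child.data:
                continue
            # Merge adjacent character data into one maximal string.
            if children and isinstance(children[-1], str):
                children[-1] += child.data
            else:
                children.append(child.data)
        # Comments, processing instructions and other items are dropped.
    return (node.namespaceURI or "", node.localName, attributes, children)
```

For instance, build(minidom.parseString(source).documentElement) turns text that is interrupted by a comment into a single string child, since the comment is removed before the strings are formed.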
It is possible for there to be multiple distinct infosets for a
single XML document. This is because XML parsers are not required to
process all DTD declarations or expand all external parsed general
entities. Amongst these multiple infosets, there is exactly one
infoset for which [all declarations processed] is true and which does
not contain any unexpanded entity reference information items. This
is the infoset that is the basis for defining the RELAX NG data
model.
2.1. Example
Suppose the document is as follows:

xmlns:pre2="http://www.example.com/n2"/>
The element representing this document has
  - a name which has
      - the empty string as the namespace URI, representing the
        absence of any namespace
      - foo as the local name
  - a context which has
      - the URI of the document as the base URI
      - a namespace map which
          - maps the prefix xml to the namespace URI
            http://www.w3.org/XML/1998/namespace (the xml prefix is
            implicitly declared by every XML document)
          - specifies the empty string as the default namespace URI
  - an empty set of attributes
  - a sequence of children consisting of an element which has
      - a name which has
          - the namespace URI bound to the prefix pre1 as the
            namespace URI
          - bar1 as the local name
      - a context which has
          - the URI of the document as the base URI
          - a namespace map which
              - maps the prefix pre1 to its declared namespace URI
              - maps the prefix xml to the namespace URI
                http://www.w3.org/XML/1998/namespace
              - specifies the empty string as the default namespace
                URI
      - an empty set of attributes
      - an empty sequence of children
    followed by an element which has
      - a name which has
          - http://www.example.com/n2 as the namespace URI
          - bar2 as the local name
      - a context which has
          - the URI of the document as the base URI
          - a namespace map which
              - maps the prefix pre2 to the namespace URI
                http://www.example.com/n2
              - maps the prefix xml to the namespace URI
                http://www.w3.org/XML/1998/namespace
              - specifies the empty string as the default namespace
                URI
      - an empty set of attributes
      - an empty sequence of children
3. Full syntax
The following grammar summarizes the syntax of RELAX NG.
Although we use a notation based on the XML representation of a RELAX
NG schema as a sequence of characters, the grammar must be understood
as operating at the data model level. For example, although the
syntax uses <text/>, an instance or schema can use <text></text>
instead, because they both represent the same element at the data
model level.
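The point that different character-level serializations can denote the same element (for example an empty-element tag versus a start-tag and end-tag pair) can be checked with a small sketch built on Python's standard xml.etree.ElementTree. The helper names shape and same_element are hypothetical, and the comparison deliberately ignores the context part of the data model.

```python
import xml.etree.ElementTree as ET


def shape(elem):
    """Reduce a parsed element to its name, attributes and children."""
    children = []
    if elem.text:
        children.append(elem.text)
    for child in elem:
        children.append(shape(child))
        if child.tail:
            children.append(child.tail)
    return (elem.tag, sorted(elem.attrib.items()), children)


def same_element(xml_a, xml_b):
    """True if both serializations yield the same name, attributes, children."""
    return shape(ET.fromstring(xml_a)) == shape(ET.fromstring(xml_b))
```

Under this sketch, an empty-element tag and the equivalent start-tag and end-tag pair compare equal, as do attribute values written with single or double quotes, while an element containing a space character does not compare equal to an empty one.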
All elements shown in the grammar are qualified with the namespace
URI: http://relaxng.org/ns/structure/1.0
The symbols QName and NCName are defined in [XML Namespaces]. The
anyURI symbol has the same meaning as the anyURI datatype of
[W3C XML Schema Datatypes]: it indicates a string that, after escaping
of disallowed values as described in Section 5.4 of [XLink], is a URI
reference as defined in [RFC 2396] (as modified by [RFC 2732]). The
symbol string matches any string.
In addition to the attributes shown explicitly, any element can
have an ns attribute and any element can have a datatypeLibrary
attribute. The ns attribute can have any value. The value of the
datatypeLibrary attribute must match the anyURI symbol as described in
the previous paragraph; in addition, it must not use the relative form
of URI reference and must not have a fragment identifier; as an
exception to this, the value may be the empty string.
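The constraints on the datatypeLibrary value can be approximated with a short check. This is a simplified sketch, not a full implementation: the function name is hypothetical, it relies on urllib.parse.urlsplit to detect a scheme, and it does not perform the escaping step or a complete RFC 2396 syntax check.

```python
from urllib.parse import urlsplit


def is_valid_datatype_library(value: str) -> bool:
    """Approximate the datatypeLibrary rules: empty, or absolute with no fragment."""
    if value == "":
        return True           # the empty string is explicitly allowed
    if "#" in value:
        return False          # a fragment identifier is not allowed
    # A URI reference with no scheme is in the relative form.
    return urlsplit(value).scheme != ""
```

With this sketch, a hypothetical value such as http://www.example.com/datatypes is accepted, whereas datatypes/lib (relative) and http://www.example.com/datatypes#frag (fragment) are rejected.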
Any element can also have foreign attributes in addition to the
attributes shown in the grammar. A foreign attribute is an attribute
with a name whose namespace URI is neither the empty string nor the
RELAX NG namespace URI. Any element that cannot have string children
(that is, any element other than value, param and name) may have
foreign child elements in addition to the child elements shown in the
grammar. A foreign element is an element with a name whose namespace
URI is not the RELAX NG namespace URI. There are no constraints on the
relative position of foreign child elements with respect to other
child elements.
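The two definitions differ in how they treat the empty namespace, which the following sketch makes explicit. The function names are hypothetical, and the constant assumes the RELAX NG 1.0 namespace URI http://relaxng.org/ns/structure/1.0.

```python
# Assumption: the namespace URI of RELAX NG 1.0 schema elements.
RELAXNG_NS = "http://relaxng.org/ns/structure/1.0"

# Elements that can have string children and so cannot have foreign children.
NO_FOREIGN_CHILDREN = {"value", "param", "name"}


def is_foreign_attribute(namespace_uri: str) -> bool:
    """Foreign if the namespace is neither empty nor the RELAX NG one."""
    return namespace_uri not in ("", RELAXNG_NS)


def is_foreign_element(namespace_uri: str) -> bool:
    """Foreign if the namespace is anything other than the RELAX NG one."""
    return namespace_uri != RELAXNG_NS


def may_have_foreign_children(local_name: str) -> bool:
    """True for every RELAX NG element except value, param and name."""
    return local_name not in NO_FOREIGN_CHILDREN
```

Note the asymmetry: an attribute in no namespace is not foreign, but an element in no namespace is.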
Any element can also have as children strings that consist
entirely of whitespace characters, where a whitespace character is one
of #x20, #x9, #xD or #xA. There are no constraints on the relative
position of whitespace string children with respect to child
elements.
Leading and trailing whitespace is allowed for the value of each
name, type and combine attribute and for the content of each name
element.
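The whitespace rules can be expressed with two small helpers. This is a sketch only: the names are hypothetical, and trimming is shown as one plausible way a processor might treat the permitted leading and trailing whitespace. The helpers use exactly the four whitespace characters listed above rather than Python's broader notion of whitespace.

```python
# The four whitespace characters: #x20, #x9, #xD and #xA.
WHITESPACE = "\x20\x09\x0d\x0a"


def is_whitespace_only(s: str) -> bool:
    """True if every character of s is one of the four whitespace characters."""
    return all(ch in WHITESPACE for ch in s)


def trim(s: str) -> str:
    """Remove leading and trailing whitespace, as allowed for name, type, combine."""
    return s.strip(WHITESPACE)
```

Passing the explicit character set to str.strip matters: a bare strip() would also remove characters such as the no-break space, which is not whitespace in the sense used here.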
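pattern  ::=  <element name="QName"> pattern+ </element>
  | <element> nameClass pattern+ </element>
  | <attribute name="QName"> [pattern] </attribute>
  | <attribute> nameClass [pattern] </attribute>
  | <group> pattern+ </group>
  | <interleave> pattern+ </interleave>
  | <choice> pattern+ </choice>
  | <optional> pattern+ </optional>
  | <zeroOrMore> pattern+ </zeroOrMore>
  | <oneOrMore> pattern+ </oneOrMore>
  | <list> pattern+ </list>
  | <mixed> pattern+ </mixed>
  | <ref name="NCName"/>
  | <parentRef name="NCName"/>
  | <empty/>
  | <text/>
  | <value [type="NCName"]> string </value>
  | <data type="NCName"> param* [exceptPattern] </data>
  | <notAllowed/>
  | <externalRef href="anyURI"/>
  | <grammar> grammarContent* </grammar>
param  ::=  <param name="NCName"> string </param>
exceptPattern  ::=  <except> pattern+ </except>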
grammarContent
��::=��
start
define
grammarContent