HTML Standard
13
The HTML syntax
13.1
Writing HTML documents
13.1.1
The DOCTYPE
13.1.2
Elements
13.1.2.1
Start tags
13.1.2.2
End tags
13.1.2.3
Attributes
13.1.2.4
Optional tags
13.1.2.5
Restrictions on content models
13.1.2.6
Restrictions on the contents of raw text and escapable raw text elements
13.1.3
Text
13.1.3.1
Newlines
13.1.4
Character references
13.1.5
CDATA sections
13.1.6
Comments
13
The HTML syntax
This section only describes the rules for resources labeled with an
HTML
MIME type
. Rules for XML resources are discussed in the section below entitled "
The
XML syntax
".
13.1
Writing HTML documents
This section only applies to documents, authoring tools, and markup generators. In
particular, it does not apply to conformance checkers; conformance checkers must use the
requirements given in the next section ("parsing HTML documents").
Documents must consist of the following parts, in the given
order:
Optionally, a single U+FEFF BYTE ORDER MARK (BOM) character.
Any number of
comments
and
ASCII
whitespace
DOCTYPE
Any number of
comments
and
ASCII
whitespace
The
document element
, in the form of an
html
element
Any number of
comments
and
ASCII
whitespace
The various types of content mentioned above are described in the next few sections.
In addition, there are some restrictions on how
character encoding declarations
are to be serialized, as discussed in the
section on that topic.
ASCII whitespace
before the
html
element, at the start of the
html
element and before the
head
element, will be dropped when the
document is parsed;
ASCII whitespace
after
the
html
element
will be parsed as if it were at the end of the
body
element. Thus,
ASCII
whitespace
around the
document element
does not round-trip.
It is suggested that newlines be inserted after the DOCTYPE, after any comments that are
before the document element, after the
html
element's start tag (if it is not
omitted
), and after any comments that are inside the
html
element but before the
head
element.
Many strings in the HTML syntax (e.g. the names of elements and their attributes) are
case-insensitive, but only for
ASCII upper alphas
and
ASCII lower alphas
. For convenience, in this section this
is just referred to as "case-insensitive".
13.1.1
The DOCTYPE
DOCTYPE
is a
required preamble.
DOCTYPEs are required for legacy reasons. When omitted, browsers tend to use a
different rendering mode that is incompatible with some specifications. Including the DOCTYPE in a
document ensures that the browser makes a best-effort attempt at following the relevant
specifications.
A DOCTYPE must consist of the following components, in this order:
A string that is an
ASCII case-insensitive
match for the string "
".
One or more
ASCII whitespace
A string that is an
ASCII case-insensitive
match for the string "
html
".
Optionally, a
DOCTYPE legacy string
Zero or more
ASCII whitespace
A U+003E GREATER-THAN SIGN character (>).
In other words,
, case-insensitively.
For the purposes of HTML generators that cannot output HTML markup with the short DOCTYPE
", a
DOCTYPE legacy string
may be inserted
into the DOCTYPE (in the position defined above). This string must consist of:
One or more
ASCII whitespace
A string that is an
ASCII case-insensitive
match for the string "
SYSTEM
".
One or more
ASCII whitespace
A U+0022 QUOTATION MARK or U+0027 APOSTROPHE character (the
quote mark
).
The literal string "
about:legacy-compat
".
A matching U+0022 QUOTATION MARK or U+0027 APOSTROPHE character (i.e. the same character as in the earlier step labeled
quote mark
).
In other words,
or
, case-insensitively except for the
part in single or double quotes.
The
DOCTYPE legacy string
should not be used unless the document is generated from
a system that cannot output the shorter string.
13.1.2
Elements
There are six different kinds of
elements
void
elements
the
template
element
raw text
elements
escapable raw text elements
foreign elements
, and
normal elements
Void elements
area
base
br
col
embed
hr
img
input
link
meta
source
track
wbr
The
template
element
template
Raw text elements
script
style
Escapable raw text elements
textarea
title
Foreign elements
Elements from the
MathML namespace
and the
SVG namespace
Normal elements
All other allowed
HTML elements
are normal elements.
Tags
are used to delimit the start and end of elements in the
markup.
Raw text
escapable raw text
, and
normal
elements have
start tag
to indicate where they begin, and an
end tag
to indicate where they end. The start and end tags of
certain
normal elements
can be
omitted
, as
described below in the section on
optional tags
. Those
that cannot be omitted must not be omitted.
Void elements
only have a start tag; end
tags must not be specified for
void elements
Foreign elements
must
either have a start tag and an end tag, or a start tag that is marked as self-closing, in which
case they must not have an end tag.
The
contents
of the element must be placed between
just after the start tag (which
might be implied, in certain
cases
) and just before the end tag (which again,
might be
implied in certain cases
). The exact allowed contents of each individual element depend on
the
content model
of that element, as described earlier in
this specification. Elements must not contain content that their content model disallows. In
addition to the restrictions placed on the contents by those content models, however, the five
types of elements have additional
syntactic
requirements.
Void elements
can't have any contents (since there's no end tag, no content can be
put between the start tag and the end tag).
The
template
element
can have
template contents
, but such
template contents
are not children of the
template
element itself. Instead, they are stored in a
DocumentFragment
associated with a different
Document
— without a
browsing context
— so
as to avoid the
template
contents interfering with the main
Document
The markup for the
template contents
of a
template
element is placed
just after the
template
element's start tag and just before
template
element's end tag (as with other elements), and may consist of any
text
character references
elements
, and
comments
, but
the text must not contain the character U+003C LESS-THAN SIGN (<) or an
ambiguous ampersand
Raw text elements
can have
text
, though it has
restrictions
described below.
Escapable raw text elements
can have
text
and
character references
, but the text must not contain an
ambiguous ampersand
. There are also
further restrictions
described below.
Foreign elements
whose start tag is marked as self-closing can't have any contents
(since, again, as there's no end tag, no content can be put between the start tag and the end
tag).
Foreign elements
whose start tag is
not
marked as self-closing can
have
text
character
references
CDATA sections
, other
elements
, and
comments
, but
the text must not contain the character U+003C LESS-THAN SIGN (<) or an
ambiguous ampersand
The HTML syntax does not support namespace declarations, even in
foreign
elements
For instance, consider the following HTML fragment:
svg
metadata
cdr:license
xmlns:cdr
"https://www.example.com/cdr/metadata"
name
"MIT"
/>
metadata
svg
The innermost element,
cdr:license
, is actually in the SVG namespace, as
the "
xmlns:cdr
" attribute has no effect (unlike in XML). In fact, as the
comment in the fragment above says, the fragment is actually non-conforming. This is because
SVG 2
does not define any elements called "
cdr:license
" in
the SVG namespace.
Normal elements
can have
text
character references
, other
elements
, and
comments
, but
the text must not contain the character U+003C LESS-THAN SIGN (<) or an
ambiguous ampersand
. Some
normal elements
also have
yet more restrictions
on what content they are
allowed to hold, beyond the restrictions imposed by the content model and those described in this
paragraph. Those restrictions are described below.
Tags contain a
tag name
, giving the element's name. HTML
elements all have names that only use
ASCII
alphanumerics
. In the HTML syntax, tag names, even those for
foreign elements
may be written with any mix of lower- and uppercase letters that, when converted to all-lowercase,
matches the element's tag name; tag names are case-insensitive.
13.1.2.1
Start tags
Start tags
must have the following format:
The first character of a start tag must be a U+003C LESS-THAN SIGN character (<).
The next few characters of a start tag must be the element's
tag name
If there are to be any attributes in the next step, there must first be one or more
ASCII whitespace
Then, the start tag may have a number of attributes, the
syntax for which
is described below. Attributes must be
separated from each other by one or more
ASCII whitespace
After the attributes, or after the
tag name
if there
are no attributes, there may be one or more
ASCII whitespace
. (Some attributes are
required to be followed by a space. See the
attributes
section
below.)
Then, if the element is one of the
void elements
, or if the element is a
foreign element
, then there may be a single U+002F SOLIDUS
character (/), which on
foreign elements
marks the start tag as self-closing. On
void elements
, it does not mark the start tag as self-closing but instead is
unnecessary and has no effect of any kind. For such void elements, it should be used only with
caution — especially since, if directly preceded by an
unquoted attribute
value
, it becomes part of the attribute value rather than being discarded by the
parser.
Finally, start tags must be closed by a U+003E GREATER-THAN SIGN character (>).
13.1.2.2
End tags
End tags
must have the following format:
The first character of an end tag must be a U+003C LESS-THAN SIGN character (<).
The second character of an end tag must be a U+002F SOLIDUS character (/).
The next few characters of an end tag must be the element's
tag
name
After the tag name, there may be one or more
ASCII whitespace
Finally, end tags must be closed by a U+003E GREATER-THAN SIGN character (>).
13.1.2.3
Attributes
Attributes
for an element are expressed inside the
element's start tag.
Attributes have a name and a value.
Attribute names
must consist of one or more characters other than
controls
U+0020 SPACE, U+0022 ("), U+0027 ('), U+003E (>), U+002F (/), U+003D (=), and
noncharacters
. In the HTML syntax, attribute names, even those for
foreign elements
, may be written with any mix of
ASCII lower
and
ASCII upper alphas
Attribute values
are a mixture of
text
and
character references
except with the additional restriction that the text cannot contain an
ambiguous ampersand
Attributes can be specified in four different ways:
Empty attribute syntax
Just the
attribute name
. The value is implicitly
the empty string.
In the following example, the
disabled
attribute is
given with the empty attribute syntax:
input
disabled
If an attribute using the empty attribute syntax is to be followed by another attribute, then
there must be
ASCII whitespace
separating the two.
Unquoted attribute value syntax
The
attribute name
, followed by zero or more
ASCII whitespace
, followed by a single U+003D EQUALS SIGN character, followed by
zero or more
ASCII whitespace
, followed by the
attribute value
, which, in addition to the requirements
given above for attribute values, must not contain any literal
ASCII whitespace
any U+0022 QUOTATION MARK characters ("), U+0027 APOSTROPHE characters ('), U+003D
EQUALS SIGN characters (=), U+003C LESS-THAN SIGN characters (<), U+003E GREATER-THAN SIGN
characters (>), or U+0060 GRAVE ACCENT characters (`), and must not be the empty string.
In the following example, the
value
attribute is given
with the unquoted attribute value syntax:
input
value
yes
If an attribute using the unquoted attribute syntax is to be followed by another attribute or
by the optional U+002F SOLIDUS character (/) allowed in step 6 of the
start tag
syntax above, then there must be
ASCII
whitespace
separating the two.
Single-quoted attribute value syntax
The
attribute name
, followed by zero or more
ASCII whitespace
, followed by a single U+003D EQUALS SIGN character, followed by
zero or more
ASCII whitespace
, followed by a single U+0027 APOSTROPHE character
('), followed by the
attribute value
, which, in
addition to the requirements given above for attribute values, must not contain any literal
U+0027 APOSTROPHE characters ('), and finally followed by a second single U+0027 APOSTROPHE
character (').
In the following example, the
type
attribute is given
with the single-quoted attribute value syntax:
input
type
'checkbox'
If an attribute using the single-quoted attribute syntax is to be followed by another
attribute, then there must be
ASCII whitespace
separating the two.
Double-quoted attribute value syntax
The
attribute name
, followed by zero or more
ASCII whitespace
, followed by a single U+003D EQUALS SIGN character, followed by
zero or more
ASCII whitespace
, followed by a single U+0022 QUOTATION MARK character
("), followed by the
attribute value
, which, in
addition to the requirements given above for attribute values, must not contain any literal
U+0022 QUOTATION MARK characters ("), and finally followed by a second single U+0022 QUOTATION
MARK character (").
In the following example, the
name
attribute is given with
the double-quoted attribute value syntax:
input
name
"be evil"
If an attribute using the double-quoted attribute syntax is to be followed by another
attribute, then there must be
ASCII whitespace
separating the two.
There must never be two or more attributes on the same start tag whose names are an
ASCII
case-insensitive
match for each other.
When a
foreign element
has one of the namespaced
attributes given by the local name and namespace of the first and second cells of a row from the
following table, it must be written using the name given by the third cell from the same row.
Local name
Namespace
Attribute name
actuate
XLink namespace
xlink:actuate
arcrole
XLink namespace
xlink:arcrole
href
XLink namespace
xlink:href
role
XLink namespace
xlink:role
show
XLink namespace
xlink:show
title
XLink namespace
xlink:title
type
XLink namespace
xlink:type
lang
XML namespace
xml:lang
space
XML namespace
xml:space
xmlns
XMLNS namespace
xmlns
xlink
XMLNS namespace
xmlns:xlink
No other namespaced attribute can be expressed in
the HTML syntax
Whether the attributes in the table above are conforming or not is defined by
other specifications (e.g.
SVG 2
and
MathML
); this section only
describes the syntax rules if the attributes are serialized using the HTML syntax.
13.1.2.4
Optional tags
Certain tags can be
omitted
Omitting an element's
start tag
in the
situations described below does not mean the element is not present; it is implied, but it is
still there. For example, an HTML document always has a root
html
element, even if
the string
doesn't appear anywhere in the markup.
An
html
element's
start tag
may be omitted
if the first thing inside the
html
element is not a
comment
For example, in the following case it's ok to remove the "
tag:
html
head
title
Hello
title
head
body
Welcome to this example.
body
html
Doing so would make the document look like this:
head
title
Hello
title
head
body
Welcome to this example.
body
html
This has the exact same DOM. In particular, note that whitespace around the
document
element
is ignored by the parser. The following example would also have the exact same
DOM:
head
title
Hello
title
head
body
Welcome to this example.
body
html
However, in the following example, removing the start tag moves the comment to before the
html
element:
html
head
title
Hello
title
head
body
Welcome to this example.
body
html
With the tag removed, the document actually turns into the same as this:
html
head
title
Hello
title
head
body
Welcome to this example.
body
html
This is why the tag can only be removed if it is not followed by a comment: removing the tag
when there is a comment there changes the document's resulting parse tree. Of course, if the
position of the comment does not matter, then the tag can be omitted, as if the comment had been
moved to before the start tag in the first place.
An
html
element's
end tag
may be omitted if
the
html
element is not immediately followed by a
comment
head
element's
start tag
may be omitted if
the element is empty, or if the first thing inside the
head
element is an
element.
head
element's
end tag
may be omitted if
the
head
element is not immediately followed by
ASCII whitespace
or a
comment
body
element's
start tag
may be omitted
if the element is empty, or if the first thing inside the
body
element is not
ASCII whitespace
or a
comment
, except if the
first thing inside the
body
element is a
meta
noscript
link
script
style
, or
template
element.
body
element's
end tag
may be omitted if the
body
element is not immediately followed by a
comment
Note that in the example above, the
head
element start and end tags, and the
body
element start tag, can't be omitted, because they are surrounded by
whitespace:
html
head
title
Hello
title
head
body
Welcome to this example.
body
html
(The
body
and
html
element end tags could be omitted without
trouble; any spaces after those get parsed into the
body
element anyway.)
Usually, however, whitespace isn't an issue. If we first remove the whitespace we don't care
about:
html
><
head
><
title
Hello
title
>
head
><
body
><
Welcome to this example.
>
body
>
html
Then we can omit a number of tags without affecting the DOM:
title
Hello
title
><
Welcome to this example.
At that point, we can also add some whitespace back:
title
Hello
title
Welcome to this example.
This would be equivalent to this document, with the omitted tags shown in their
parser-implied positions; the only whitespace text node that results from this is the newline at
the end of the
head
element:
html
><
head
title
Hello
title
head
><
body
Welcome to this example.
body
>
html
An
li
element's
end tag
may be omitted if the
li
element is immediately followed by another
li
element or if there is
no more content in the parent element.
dt
element's
end tag
may be omitted if the
dt
element is immediately followed by another
dt
element or a
dd
element.
dd
element's
end tag
may be omitted if the
dd
element is immediately followed by another
dd
element or a
dt
element, or if there is no more content in the parent element.
element's
end tag
may be omitted if the
element is immediately followed by an
address
article
aside
blockquote
details
dialog
div
dl
fieldset
figcaption
figure
footer
form
h1
h2
h3
h4
h5
h6
header
hgroup
hr
main
nav
ol
pre
section
table
, or
ul
element, or if there is no more content in the parent
element and the parent element is an
HTML element
that is not
an
audio
del
ins
map
noscript
, or
video
element, or an
autonomous custom
element
We can thus simplify the earlier example further:
title
Hello
title
><
Welcome to this example.
An
rt
element's
end tag
may be omitted if the
rt
element is immediately followed by an
rt
or
rp
element,
or if there is no more content in the parent element.
An
rp
element's
end tag
may be omitted if the
rp
element is immediately followed by an
rt
or
rp
element,
or if there is no more content in the parent element.
An
optgroup
element's
end tag
may be omitted
if the
optgroup
element is
immediately followed by another
optgroup
element, if it is immediately followed by an
hr
element, or if there is no more content in the parent
element.
An
option
element's
end tag
may be omitted if
the
option
element is immediately followed by another
option
element, if
it is immediately followed by an
optgroup
element, if it is immediately followed by
an
hr
element, or if there is no more content in the parent element.
colgroup
element's
start tag
may be
omitted if the first thing inside the
colgroup
element is a
col
element,
and if the element is not immediately preceded by another
colgroup
element whose
end tag
has been omitted. (It can't be omitted if the element
is empty.)
colgroup
element's
end tag
may be omitted
if the
colgroup
element is not immediately followed by
ASCII whitespace
or a
comment
caption
element's
end tag
may be omitted if
the
caption
element is not immediately followed by
ASCII whitespace
or a
comment
thead
element's
end tag
may be omitted if
the
thead
element is immediately followed by a
tbody
or
tfoot
element.
tbody
element's
start tag
may be omitted
if the first thing inside the
tbody
element is a
tr
element, and if the
element is not immediately preceded by a
tbody
thead
, or
tfoot
element whose
end tag
has been omitted. (It
can't be omitted if the element is empty.)
tbody
element's
end tag
may be omitted if
the
tbody
element is immediately followed by a
tbody
or
tfoot
element, or if there is no more content in the parent element.
tfoot
element's
end tag
may be omitted if
there is no more content in the parent element.
tr
element's
end tag
may be omitted if the
tr
element is immediately followed by another
tr
element, or if there is
no more content in the parent element.
td
element's
end tag
may be omitted if the
td
element is immediately followed by a
td
or
th
element,
or if there is no more content in the parent element.
th
element's
end tag
may be omitted if the
th
element is immediately followed by a
td
or
th
element,
or if there is no more content in the parent element.
The ability to omit all these table-related tags makes table markup much terser.
Take this example:
table
caption
37547 TEE Electric Powered Rail Car Train Functions (Abbreviated)
caption
colgroup
><
col
><
col
><
col
>
colgroup
thead
tr
th
Function
th
th
Control Unit
th
th
Central Station
th
tr
thead
tbody
tr
td
Headlights
td
td
td
td
td
tr
tr
td
Interior Lights
td
td
td
td
td
tr
tr
td
Electric locomotive operating sounds
td
td
td
td
td
tr
tr
td
Engineer's cab lighting
td
td
>
td
td
td
tr
tr
td
Station Announcements - Swiss
td
td
>
td
td
td
tr
tbody
table
The exact same table, modulo some whitespace differences, could be marked up as follows:
table
caption
37547 TEE Electric Powered Rail Car Train Functions (Abbreviated)
colgroup
><
col
><
col
><
col
thead
tr
th
Function
th
Control Unit
th
Central Station
tbody
tr
td
Headlights
td
td
tr
td
Interior Lights
td
td
tr
td
Electric locomotive operating sounds
td
td
tr
td
Engineer's cab lighting
td
td
tr
td
Station Announcements - Swiss
td
td
table
Since the cells take up much less room this way, this can be made even terser by having each
row on one line:
table
caption
37547 TEE Electric Powered Rail Car Train Functions (Abbreviated)
colgroup
><
col
><
col
><
col
thead
tr
th
Function
th
Control Unit
th
Central Station
tbody
tr
td
Headlights
td
td
tr
td
Interior Lights
td
td
tr
td
Electric locomotive operating sounds
td
td
tr
td
Engineer's cab lighting
td
td
tr
td
Station Announcements - Swiss
td
td
table
The only differences between these tables, at the DOM level, is with the precise position of
the (in any case semantically-neutral) whitespace.
However
, a
start tag
must never be
omitted if it has any attributes.
Returning to the earlier example with all the whitespace removed and then all the optional
tags removed:
title
Hello
title
><
Welcome to this example.
If the
body
element in this example had to have a
class
attribute and the
html
element had to have a
lang
attribute, the markup would have to become:
html
lang
"en"
><
title
Hello
title
><
body
class
"demo"
><
Welcome to this example.
This section assumes that the document is conforming, in particular, that there
are no
content model
violations. Omitting tags in the fashion
described in this section in a document that does not conform to the
content models
described in this specification is likely to result in unexpected DOM differences (this is, in
part, what the content models are designed to avoid).
13.1.2.5
Restrictions on content models
For historical reasons, certain elements have extra restrictions beyond even the restrictions
given by their content model.
table
element must not contain
tr
elements, even though these
elements are technically allowed inside
table
elements according to the content
models described in this specification. (If a
tr
element is put inside a
table
in the markup, it will in fact imply a
tbody
start tag before
it.)
A single
newline
may be placed immediately after the
start tag
of
pre
and
textarea
elements.
This does not affect the processing of the element. The otherwise optional
newline
must
be included if the element's contents
themselves start with a
newline
(because otherwise the
leading newline in the contents would be treated like the optional newline, and ignored).
The following two
pre
blocks are equivalent:
pre
Hello
pre
pre
Hello
pre
13.1.2.6
Restrictions on the contents of raw text and escapable raw text elements
The text in
raw text
and
escapable raw text
elements
must not contain any occurrences of the string "
(U+003C LESS-THAN SIGN, U+002F SOLIDUS) followed by characters that case-insensitively match the
tag name of the element followed by one of U+0009 CHARACTER TABULATION (tab), U+000A LINE FEED
(LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), U+0020 SPACE, U+003E GREATER-THAN SIGN
(>), or U+002F SOLIDUS (/).
13.1.3
Text
Text
is allowed inside elements, attribute values, and comments.
Extra constraints are placed on what is and what is not allowed in text based on where the text is
to be put, as described in the other sections.
13.1.3.1
Newlines
Newlines
in HTML may be represented either as U+000D
CARRIAGE RETURN (CR) characters, U+000A LINE FEED (LF) characters, or pairs of U+000D CARRIAGE
RETURN (CR), U+000A LINE FEED (LF) characters in that order.
Where
character references
are allowed, a character
reference of a U+000A LINE FEED (LF) character (but not a U+000D CARRIAGE RETURN (CR) character)
also represents a
newline
13.1.4
Character references
In certain cases described in other sections,
text
may be
mixed with
character references
. These can be used to escape
characters that couldn't otherwise legally be included in
text
Character references must start with a U+0026 AMPERSAND character (&). Following this,
there are three possible kinds of character references:
Named character references
The ampersand must be followed by one of the names given in the
named character
references
section, using the same case. The name must be one that is
terminated by a U+003B SEMICOLON character (;).
Decimal numeric character reference
The ampersand must be followed by a U+0023 NUMBER SIGN character (#), followed by one or more
ASCII digits
, representing a base-ten integer that corresponds to a code point that
is allowed according to the definition below. The digits must then be followed by a U+003B
SEMICOLON character (;).
Hexadecimal numeric character reference
The ampersand must be followed by a U+0023 NUMBER SIGN character (#), which must be followed
by either a U+0078 LATIN SMALL LETTER X character (x) or a U+0058 LATIN CAPITAL LETTER X
character (X), which must then be followed by one or more
ASCII hex digits
representing a hexadecimal integer that corresponds to a code point that is allowed according to
the definition below. The digits must then be followed by a U+003B SEMICOLON character (;).
The numeric character reference forms described above are allowed to reference any code point
excluding U+000D CR,
noncharacters
, and
controls
other than
ASCII whitespace
An
ambiguous ampersand
is a U+0026 AMPERSAND
character (&) that is followed by one or more
ASCII
alphanumerics
, followed by a U+003B SEMICOLON character (;), where these characters do not
match any of the names given in the
named character references
section.
13.1.5
CDATA sections
CDATA sections
must consist of the following components, in
this order:
The string "
".
Optionally,
text
, with the additional restriction that the
text must not contain the string "
]]>
".
The string "
]]>
".
CDATA sections can only be used in foreign content (MathML or SVG). In this example, a CDATA
section is used to escape the contents of a
MathML
ms
element:
You can add a string to a number, but this stringifies the number:
math
ms
ms
mo
mo
mn
mn
mo
mo
ms
ms
math
13.1.6
Comments
Comments
must have the following format:
The string "
", or "
--!>
", nor end with the string "
".
The string "
-->
".
The
text
is allowed to end with the string
", as in
US