Aural style sheets
Appendix A. Aural style sheets
Contents
A.1 The media types 'aural' and 'speech'
A.2 Introduction to aural style sheets
  A.2.1 Angles
  A.2.2 Times
  A.2.3 Frequencies
A.3 Volume properties: 'volume'
A.4 Speaking properties: 'speak'
A.5 Pause properties: 'pause-before', 'pause-after', and 'pause'
A.6 Cue properties: 'cue-before', 'cue-after', and 'cue'
A.7 Mixing properties: 'play-during'
A.8 Spatial properties: 'azimuth' and 'elevation'
A.9 Voice characteristic properties: 'speech-rate', 'voice-family', 'pitch', 'pitch-range', 'stress', and 'richness'
A.10 Speech properties: 'speak-punctuation' and 'speak-numeral'
A.11 Audio rendering of tables
  A.11.1 Speaking headers: the 'speak-header' property
A.12 Sample style sheet for HTML
A.13 Emacspeak
Note: Several sections of this specification have been updated by other
specifications. Please see
"Cascading Style Sheets (CSS) — The Official Definition"
in the latest CSS Snapshot for a list of specifications and the
sections they replace.
The CSS Working Group is also developing CSS level 2 revision 2 (CSS 2.2).
This chapter is informative. UAs are not required to implement the
properties of this chapter in order to conform to CSS 2.1.
A.1 The media types 'aural' and 'speech'
We expect that in a future level of CSS there will be new
properties and values defined for speech output. Therefore
CSS 2.1 reserves the 'speech' media type (see chapter 7, "Media
types"), but does not yet define which properties do or do not
apply to it.
The properties in this appendix apply to the media type 'aural', which
was introduced in CSS2. The type 'aural' is now deprecated.
This means that a style sheet such as
@media speech {
  body { voice-family: Paul }
}
is valid, but that its meaning is not defined by CSS 2.1,
while
@media aural {
  body { voice-family: Paul }
}
is deprecated, but defined by this appendix.
A.2 Introduction to aural style sheets
The aural rendering of a document, already commonly used by the
blind and print-impaired communities, combines speech synthesis and
"auditory icons." Often such aural presentation occurs by converting
the document to plain text and feeding this to a screen reader --
software or hardware that simply reads all the characters on the
screen. This results in less effective presentation than would be the
case if the document structure were retained. Style sheet properties
for aural presentation may be used together with visual properties
(mixed media) or as an aural alternative to visual presentation.
Besides the obvious accessibility advantages, there are other large
markets for listening to information, including in-car use, industrial
and medical documentation systems (intranets), home entertainment, and
helping users who are learning to read or who have difficulty reading.
When using aural properties, the canvas consists of a
three-dimensional physical space (sound surrounds) and a temporal
space (one may specify sounds before, during, and after other sounds).
The CSS properties also allow authors to vary the quality of
synthesized speech (voice type, frequency, inflection, etc.).
Example(s):
h1, h2, h3, h4, h5, h6 {
  voice-family: paul;
  stress: 20;
  richness: 90;
  cue-before: url("ping.au")
}
p.heidi { azimuth: center-left }
p.peter { azimuth: right }
p.goat { volume: x-soft }
This will direct the speech synthesizer to speak headers in a voice
(a kind of "audio font") called "paul", on a flat tone, but in a very
rich voice. Before speaking the headers, a sound sample will be played
from the given URL. Paragraphs with class "heidi" will appear to come
from front left (if the sound system is capable of spatial audio), and
paragraphs of class "peter" from the right. Paragraphs with class
"goat" will be very soft.
A.2.1 Angles
Angle values are denoted by <angle> in the text.
Their format is a <number> immediately followed by an angle unit
identifier.
Angle unit identifiers are:
deg: degrees
grad: grads
rad: radians
Angle values may be negative. They should be normalized to the
range 0-360deg by the user agent. For example, -10deg and 350deg are
equivalent.
For example, a right angle is '90deg' or '100grad' or
'1.570796326794897rad'.
Like for <length>, the unit may be omitted if the value is
zero: '0deg' may be written as '0'.
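The unit conversions and the normalization rule above can be illustrated with a short sketch. It is written in Python purely for illustration and is not part of CSS 2.1: the function name normalize_angle, the decision to return degrees as a float, and the mapping of 360deg to 0 are all assumptions made here. A real user agent may parse and normalize angles differently.

```python
import math
import re

# Optional sign, a CSS 2.1 style number, and an optional unit identifier.
_ANGLE_RE = re.compile(r"^([+-]?(?:\d*\.\d+|\d+))(deg|grad|rad)?$", re.IGNORECASE)

# Conversion factors to degrees: 400 grads and 2*pi radians make a full turn.
_TO_DEGREES = {"deg": 1.0, "grad": 0.9, "rad": 180.0 / math.pi}


def normalize_angle(text):
    """Return the angle in degrees, normalized to the range [0, 360).

    Hypothetical helper for illustration only.
    """
    match = _ANGLE_RE.match(text.strip())
    if match is None:
        raise ValueError("not an angle value: %r" % text)
    number = float(match.group(1))
    unit = match.group(2)
    if unit is None:
        # The unit may be omitted only when the value is zero.
        if number != 0:
            raise ValueError("unit required for non-zero angle: %r" % text)
        return 0.0
    degrees = number * _TO_DEGREES[unit.lower()]
    # Python's % takes the sign of the divisor, so negative angles wrap upward.
    return degrees % 360.0
```

Under these assumptions, normalize_angle("-10deg") and normalize_angle("350deg") both give 350.0, and '100grad' and '1.570796326794897rad' both come out at (approximately) 90 degrees.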
A.2.2 Times
Time values are denoted by <time> in the text.
Travel Expense Report
              Meals   Hotels  Transport  subtotal
San Jose
  25-Aug-97   37.74   112.00      45.00
  26-Aug-97   27.28   112.00      45.00
  subtotal    65.02   224.00      90.00    379.02
Seattle
  27-Aug-97   96.25   109.00      36.00
  28-Aug-97   35.00   109.00      36.00
  subtotal   131.25   218.00      72.00    421.25
Totals       196.27   442.00     162.00    800.27

By providing the data model in this way, authors make it
possible for speech-enabled browsers to explore the table in
rich ways, e.g., each cell could be spoken as a list, repeating the
applicable headers before each data cell:
San Jose, 25-Aug-97, Meals: 37.74
San Jose, 25-Aug-97, Hotels: 112.00
San Jose, 25-Aug-97, Transport: 45.00
...
The browser could also speak the headers only when they change:
San Jose, 25-Aug-97, Meals: 37.74
Hotels: 112.00
Transport: 45.00
26-Aug-97, Meals: 27.28
Hotels: 112.00
...
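The two renderings above can be sketched as a small routine. This is a minimal illustration in Python, not a description of how any browser works: the function name speak_cells, the repeat_headers flag, and the flat (location, date, column, value) tuples are assumptions made here, and the cell values are copied from the San Jose rows of the table. Headers are compared position by position with the previous cell, which is one simple way to decide when a header has "changed".

```python
# San Jose rows from the travel expense table: (location, date, column, value).
CELLS = [
    ("San Jose", "25-Aug-97", "Meals", "37.74"),
    ("San Jose", "25-Aug-97", "Hotels", "112.00"),
    ("San Jose", "25-Aug-97", "Transport", "45.00"),
    ("San Jose", "26-Aug-97", "Meals", "27.28"),
    ("San Jose", "26-Aug-97", "Hotels", "112.00"),
    ("San Jose", "26-Aug-97", "Transport", "45.00"),
]


def speak_cells(cells, repeat_headers=True):
    """Return one spoken line per data cell (hypothetical helper).

    With repeat_headers=True every applicable header precedes each cell.
    With repeat_headers=False a header is spoken only when it differs
    from the header in the same position for the previous cell.
    """
    lines = []
    previous = None
    for *headers, value in cells:
        if repeat_headers or previous is None:
            spoken = headers
        else:
            spoken = [h for h, p in zip(headers, previous) if h != p]
        prefix = ", ".join(spoken)
        lines.append(prefix + ": " + value if prefix else value)
        previous = headers
    return lines
```

Calling speak_cells(CELLS, repeat_headers=True) reproduces the first listing, and speak_cells(CELLS, repeat_headers=False) reproduces the second. These two modes correspond roughly to the 'always' and 'once' values of the 'speak-header' property.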
A.12 Sample style sheet for HTML
This style sheet describes a possible rendering of HTML 4:
@media aural {
h1, h2, h3,
h4, h5, h6 { voice-family: paul, male; stress: 20; richness: 90 }
h1 { pitch: x-low; pitch-range: 90 }
h2 { pitch: x-low; pitch-range: 80 }
h3 { pitch: low; pitch-range: 70 }
h4 { pitch: medium; pitch-range: 60 }
h5 { pitch: medium; pitch-range: 50 }
h6 { pitch: medium; pitch-range: 40 }
li, dt, dd { pitch: medium; richness: 60 }
dt { stress: 80 }
pre, code, tt { pitch: medium; pitch-range: 0; stress: 0; richness: 80 }
em { pitch: medium; pitch-range: 60; stress: 60; richness: 50 }
strong { pitch: medium; pitch-range: 60; stress: 90; richness: 90 }
dfn { pitch: high; pitch-range: 60; stress: 60 }
s, strike { richness: 0 }
i { pitch: medium; pitch-range: 60; stress: 60; richness: 50 }
b { pitch: medium; pitch-range: 60; stress: 90; richness: 90 }
u { richness: 0 }
a:link { voice-family: harry, male }
a:visited { voice-family: betty, female }
a:active { voice-family: betty, female; pitch-range: 80; pitch: x-high }
}
A.13 Emacspeak
For information, here is the list of properties implemented by
Emacspeak, a speech subsystem for the Emacs editor.
voice-family
stress (but with a different range of values)
richness (but with a different range of values)
pitch (but with differently named values)
pitch-range (but with a different range of values)
(We thank T. V. Raman for the information about implementation
status of aural properties.)