Speaker Directives

EMPH (container element): sets the emphasis of the contained text.
- LEVEL (numeric, descriptive)
BREAK (empty element): sets an intrasentential, prosodic break at current position.
- LEVEL (numeric, descriptive)
- MSEC (numeric)
- TYPE (descriptive): a punctuation symbol that represents (roughly) the kind of intonation contour to be associated with the material preceding the break (e.g. '?' to mark "question" intonation).
PITCH (container element): sets properties associated with pitch of the enclosed region.
- BASE (numeric, descriptive)
- MIDDLE (numeric, descriptive)
- RANGE (numeric, descriptive)
RATE (container element): sets the average speech rate of the enclosed region.
- SPEED (numeric, descriptive)
VOLUME (container element): sets the amplitude of the enclosed region in terms of the available range of the engine.
- LEVEL (numeric, descriptive)
AUDIO (empty element): loads an audio document from a URL and plays it starting at the given point.
- SRC: URL of audio document
- MODE: specifies whether to play in background or not
- LEVEL: level of audio relative to surrounding speech
PRON (container element): substitutes the specified pronunciation for what would normally correspond to the contained text.
- IPA: character string in Unicode IPA (International Phonetic Alphabet)
- SUB: attempt at "phonetic" spelling in the language of the enclosing text
- ORIGIN: ISO 639 identifier for the language of origin of the enclosed text
LANGUAGE (container element): specifies the language of the contained text.
- ID: ISO 639 identifier for the language
SPEAKER (container element): defines properties of the speaker speaking the contained text.
- GENDER
- AGE (descriptive)
- NAME: "name" of a speaker if a particular engine is being used
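
The inventory above can be captured as a small lookup table, which may be convenient for checking markup before it is handed to a synthesizer. The following Python sketch is illustrative only: the names SPEAKER_DIRECTIVES and unknown_attributes are hypothetical and not part of SABLE, and the check covers only the attribute names listed in this section, not their permitted values.

```python
# Tag -> (kind, attribute names), transcribed from the list in this section.
# The table and helper are hypothetical illustrations, not a SABLE API.
SPEAKER_DIRECTIVES = {
    "EMPH": ("container", {"LEVEL"}),
    "BREAK": ("empty", {"LEVEL", "MSEC", "TYPE"}),
    "PITCH": ("container", {"BASE", "MIDDLE", "RANGE"}),
    "RATE": ("container", {"SPEED"}),
    "VOLUME": ("container", {"LEVEL"}),
    "AUDIO": ("empty", {"SRC", "MODE", "LEVEL"}),
    "PRON": ("container", {"IPA", "SUB", "ORIGIN"}),
    "LANGUAGE": ("container", {"ID"}),
    "SPEAKER": ("container", {"GENDER", "AGE", "NAME"}),
}


def unknown_attributes(tag, attributes):
    """Return, sorted, the attribute names not listed for `tag` above."""
    if tag not in SPEAKER_DIRECTIVES:
        raise KeyError(f"{tag} is not a speaker directive")
    _kind, allowed = SPEAKER_DIRECTIVES[tag]
    return sorted(set(attributes) - allowed)
```

For instance, unknown_attributes("RATE", ["SPEED", "LEVEL"]) reports LEVEL, since SPEED is the only attribute listed for RATE.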

As a sample of the use of some of these tags, consider the following example from a hypothetical e-mail reader that uses SABLE markup. Since e-mail readers have access to information about at least some structural aspects of the input (e.g. header information about the sender, subject and date), this information can be used to control the synthesizer's behavior in useful ways:

<DIV TYPE="paragraph">New e-mail from
<EMPH>Tom Jones</EMPH>
regarding <PITCH BASE="high" RANGE="large">
<RATE SPEED="-20%">latest album</RATE>
</PITCH>.</DIV>
<AUDIO SRC="beep.aiff"/>

The subject information ("latest album") is highlighted auditorily by setting a higher base pitch and larger pitch range, and by slowing down the speech by 20%. Finally, the header is terminated by an audible beep ("beep.aiff").
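
One way such an e-mail reader might produce this markup from header fields is sketched below in Python. The function header_markup and its parameters are hypothetical, and escaping the field values with the standard xml.sax.saxutils helpers is an assumption about what a SABLE processor would expect rather than something this section specifies.

```python
from xml.sax.saxutils import escape, quoteattr


def header_markup(sender, subject, beep_src="beep.aiff"):
    """Build markup announcing a new message, mirroring the sample above.

    Hypothetical helper: the sender is emphasized, the subject is spoken
    with a high base pitch, a large pitch range and a 20% slower rate,
    and an audio file is played after the header.
    """
    return (
        '<DIV TYPE="paragraph">New e-mail from\n'
        f"<EMPH>{escape(sender)}</EMPH>\n"
        'regarding <PITCH BASE="high" RANGE="large">\n'
        f'<RATE SPEED="-20%">{escape(subject)}</RATE>\n'
        "</PITCH>.</DIV>\n"
        # quoteattr escapes the value and adds the surrounding quotes.
        f"<AUDIO SRC={quoteattr(beep_src)}/>"
    )


print(header_markup("Tom Jones", "latest album"))
```

Called with the sender and subject from the sample, this reproduces the markup shown above; characters such as & or < in a header field are escaped so that they cannot be mistaken for markup.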

Richard Sproat
1998-11-16