SABLE is designed to function as a well-defined standard in which the same text is handled consistently by multiple synthesizers. It is also intended to serve as a tool for speech synthesis research and for innovation. It is therefore expected that research systems will support tags, attributes and attribute values not defined in the SABLE specification, and that SABLE text containing those extensions will be generated for specific systems. Where such extensions prove useful and become generally supported, they can be proposed as additions to the standard specification.
To clearly distinguish tags, attributes and attribute
values that are non-standard, they should begin with an ``X-'' prefix,
optionally followed by an engine identifier.
A non-standard tag for providing an engine-specific pronunciation
string would look like:
<X-ME-PRON PHON="i" DUR="120"/>
where ME identifies ``My Engine'' and the X-ME-PRON element, which is
understood by ``My Engine'', inserts an /i/ phoneme with a duration of
120 msec.
Here, because the PHON and DUR attributes are embedded in a
non-standard element, they are implicitly non-standard attributes. A
non-standard attribute of a standard tag would look as follows:
<PRON X-ME-PHONES="ka:t">cat</PRON>
or
<EMPH LEVEL="strong"
X-PITCHACCENT="H*+L">word</EMPH>
The first example provides the pronunciation for cat in a
format that is understood by ``My Engine''. Other synthesizers will
ignore the attribute. The second example includes both a standard
attribute -- LEVEL -- and a non-standard attribute --
X-PITCHACCENT. A system that understands the non-standard attribute
will apply the ``H*+L'' accent when producing strong emphasis on
``word''. (The engine identifier need not be used, as in the
X-PITCHACCENT example, particularly for attributes that may be
recognized by more than one synthesizer.)
Finally, a non-standard attribute value might look like:
<DIV TYPE="x-dialog-close">...</DIV>
The ``x-dialog-close'' is a non-standard value of the standard TYPE
attribute which is currently specified as being either ``sentence'' or
``paragraph''. This non-standard value could indicate that the
contents of the element are the end of a dialog turn.
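The naming convention used in the examples above can be sketched in a few lines of Python. This is only an illustration: the helper name extension_name and its parameters are hypothetical and are not part of SABLE.

```python
def extension_name(base, engine=None):
    """Build a non-standard name: "X-", an optional engine identifier, then the base name.

    Hypothetical helper for illustration only; SABLE defines the naming
    convention, not this function.
    """
    parts = ["X"]
    if engine:
        parts.append(engine)
    parts.append(base)
    return "-".join(parts)


# Engine-specific tag and attribute names, as in the ``My Engine'' examples.
print(extension_name("PRON", engine="ME"))
print(extension_name("PHONES", engine="ME"))
# No engine identifier, for an attribute several synthesizers may recognize.
print(extension_name("PITCHACCENT"))
```

Omitting the engine argument mirrors the X-PITCHACCENT case, where the extension is not tied to a single synthesizer.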
If an engine encounters a non-standard tag, attribute or attribute value in its input text that it does not recognize, it simply ignores it. For example, in the X-ME-PHONES example, a synthesizer that does not recognize the attribute ignores it and tries to say the word cat as it normally would. Wherever possible, non-standard tags, attributes and attribute values should be designed so that output is not substantially affected if they are ignored.
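One way an engine might apply this rule is sketched below in Python, using the standard xml.etree.ElementTree module. It rests on several assumptions that the text above does not settle: the input is well-formed XML wrapped in a SABLE root element, the ``X-'' prefix is matched without regard to case, ignoring an unrecognized tag keeps any content it encloses, and ignoring an unrecognized attribute value means dropping the attribute so the engine falls back to its default behavior. The names drop_unknown and understood are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_nonstandard(name):
    """True if a tag, attribute or value carries the "X-" prefix (any case)."""
    return name.upper().startswith("X-")


def _append_text(parent, text):
    """Attach loose text to the end of parent's content."""
    if not text:
        return
    if len(parent):
        parent[-1].tail = (parent[-1].tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def drop_unknown(elem, understood=frozenset()):
    """Return a copy of elem without extensions this engine does not know.

    `understood` holds the upper-cased non-standard names and values
    that the (hypothetical) engine supports.
    """
    def unknown(name):
        return is_nonstandard(name) and name.upper() not in understood

    # Assumption: an attribute is dropped if its name or its value is unknown.
    kept = {k: v for k, v in elem.attrib.items()
            if not unknown(k) and not unknown(v)}
    result = ET.Element(elem.tag, kept)
    result.text, result.tail = elem.text, elem.tail
    for child in elem:
        cleaned = drop_unknown(child, understood)
        if unknown(child.tag):
            # Assumption: the tag is ignored but its content stays in place.
            _append_text(result, cleaned.text)
            result.extend(cleaned)
            _append_text(result, cleaned.tail)
        else:
            result.append(cleaned)
    return result


doc = ET.fromstring(
    '<SABLE>'
    '<PRON X-ME-PHONES="ka:t">cat</PRON>'
    '<X-ME-PRON PHON="i" DUR="120"/>'
    '<EMPH LEVEL="strong" X-PITCHACCENT="H*+L">word</EMPH>'
    '<DIV TYPE="x-dialog-close">...</DIV>'
    '</SABLE>'
)
# An engine with no extensions, then one that understands X-ME-PHONES.
print(ET.tostring(drop_unknown(doc), encoding="unicode"))
print(ET.tostring(drop_unknown(doc, {"X-ME-PHONES"}), encoding="unicode"))
```

With no extensions understood, the PRON element is left with only the word cat, and the standard LEVEL attribute on EMPH survives while X-PITCHACCENT is removed. A real engine might instead skip unknown items while interpreting the markup rather than rewriting the tree first.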