This is a compendium of some of the things I have worked on, with some associated publications. Much of it is quite old, though. For a more up-to-date list of what I'm currently involved in, see my home page.
My interests range from the theoretical to the applied. At one time or another I have done work in all of the following areas. Here I list only those areas where I have actually published things, along with a few sample publications in the area. A fuller list of publications can be found on the publications page.
Mark Liberman and Richard Sproat, "The Stress and Structure of
Modified Noun Phrases in English," in I. Sag (ed.), Lexical
Matters, 131-181, CSLI Publications, Chicago, University of
Chicago Press, 1992.
Richard Sproat, "English Noun-Phrase Accent Prediction for
Text-to-Speech." Computer Speech and Language, 8, 79-94,
1994.
Richard Sproat and Osamu Fujimura, "Allophonic variation in English
/l/ and its implications for phonetic implementation," Journal of
Phonetics, 21, 291-311, 1993.
One thing I'd like to do, given enough time, is return to this work
and do a complete survey for all consonants for a language. While
there have been detailed articulatory studies of many classes of
consonants (e.g. Krakow's work on nasals), to my knowledge nobody has
done a complete survey of the prosodic and syllable-affiliation
effects on the complete range of consonants for English, or any other language.
Chilin Shih and Richard Sproat, "Mandarin Text-to-Speech Synthesis,"
Proceedings of the Eighth North American Conference on Chinese
Linguistics, 1997.
Richard Sproat and Chilin Shih. "A Corpus-Based Analysis of Mandarin
Nominal Root Compounds." Journal of East Asian Linguistics,
5, 49-71, 1996.
Richard Sproat and Chilin Shih, "Why Mandarin Morphology is not
Stratum-Ordered." Yearbook of Morphology, 185-217, 1993.
Richard Sproat and Chilin Shih, "The Cross-Linguistic Distribution of
Adjective Ordering Restrictions," in C. Georgopoulos and R. Ishihara
(eds.), Interdisciplinary Approaches to Language: Essays in Honor
of S.-Y. Kuroda, 565-593, Dordrecht, Kluwer Academic
Publishers, 1990.
Richard Sproat, Morphology
and Computation, Cambridge, MA, MIT Press, 1992.
I have done various pieces of work on computational morphology over
the years. Note that a lot of the work that I did on Text Analysis for
TTS for languages like Russian involved a non-trivial amount of morphology.
Richard Sproat and Barbara Brunson, "Constituent-Based Morphological
Parsing: A New Approach to the Problem of Word-Recognition," 25th
Annual Meeting of the Association for Computational Linguistics:
Proceedings of the Conference, 65-72, 1987.
Elena Pavlova, Chilin Shih and Richard Sproat, "A Text-to-Speech
System for Russian," Proceedings of EUROSPEECH, 1997.
Richard Sproat,
"Inferring the Environment in a Text-to-Scene Conversion System",
First International Conference on Knowledge Capture (K-CAP
'01), Victoria, BC, Canada, 2001.
Harald Baayen and Richard Sproat,
"Estimating Lexical Priors for
Low-Frequency Morphologically Ambiguous Forms," Computational
Linguistics, 22(2), 1996.
Richard Sproat and Chilin Shih. "A Corpus-Based Analysis of Mandarin
Nominal Root Compounds." Journal of East Asian Linguistics,
5, 49-71, 1996.
Richard Sproat and Chilin Shih, "A Statistical Method for Finding Word
Boundaries in Chinese Text," Computer Processing of Chinese and
Oriental Languages, 4, 336-351, 1990.
Richard Sproat and Chilin Shih. "A Corpus-Based Analysis of Mandarin
Nominal Root Compounds." Journal of East Asian Linguistics,
5, 49-71, 1996.
Richard Sproat and Chilin Shih, "Why Mandarin Morphology is not
Stratum-Ordered." Yearbook of Morphology, 185-217, 1993.
Gregory Ward, Richard Sproat and Gail McKoon, "A Pragmatic Analysis of
So-Called Anaphoric Islands," Language, 67, 439-474, 1991.
Richard Sproat, On Deriving the Lexicon. MIT Working Papers
in Linguistics, Cambridge, MA. 1985. (Published version of
Ph.D. dissertation)
Richard Sproat,
"Pmtools: A Pronunciation Modeling Toolkit",
Proceedings of the Fourth ISCA Tutorial and Research Workshop on
Speech Synthesis, Blair Atholl, Scotland, 2001.
Richard Sproat, "Welsh Syntax and VSO Structure," Natural Language
and Linguistic Theory, 3, 173-216, 1985.
In 1999 I headed a project on Text Normalization at the Johns Hopkins
University WS99. While the
project was about all aspects of Text Normalization (standard
abbreviation expansion, number expansion, currency amounts, etc.), my
own particular interest was in how one might infer the expansion of
non-standard abbreviations given an unannotated corpus, where
the correct answer is to be found somewhere in the corpus. For
example, in a corpus of real estate ads, one might find
livrm, and infer that the correct expansion is living
room based on the occurrence of that phrase somewhere else in the
corpus. Various papers related to issues in text analysis are:
Richard Sproat, Alan Black, Stanley Chen, Shankar Kumar, Mari
Ostendorf, and Christopher Richards. "Normalization of non-standard
words." Computer Speech and Language, 15(3), 287-333, 2001.
Richard Sproat. "Multilingual Text Analysis for Text-to-Speech
Synthesis", Natural Language Engineering, 2(4), 369-380,
1997.
Richard Sproat, Chilin Shih, William Gale and Nancy Chang, "A Stochastic
Finite-State Word-Segmentation Algorithm for Chinese,"
Computational Linguistics, 22(3), 1996.
Mehryar Mohri and Richard Sproat. "An Efficient Compiler
for Weighted Rewrite Rules", 34th Annual Meeting of the
Association for Computational Linguistics: Proceedings of the
Conference, 1996.
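The idea described above, inferring that livrm expands to living room because that phrase occurs elsewhere in the same unannotated corpus, can be illustrated with a small sketch. This is not the method used in the WS99 project or in the papers listed here; it is a deliberately naive heuristic, and the function names and the toy corpus are hypothetical. It proposes as candidates any one- or two-word phrase in the corpus that begins with the same letter as the abbreviation and contains the abbreviation's letters in order, and ranks the candidates by frequency.

```python
import re
from collections import Counter


def is_subsequence(abbrev: str, phrase: str) -> bool:
    """True if the letters of abbrev occur, in order, in phrase (spaces ignored)."""
    letters = iter(phrase.replace(" ", ""))
    # `ch in letters` consumes the iterator, so order is enforced.
    return all(ch in letters for ch in abbrev)


def candidate_expansions(abbrev: str, corpus: list[str], max_words: int = 2):
    """Rank corpus phrases that could be expansions of abbrev (hypothetical helper)."""
    abbrev = abbrev.lower()
    counts = Counter()
    for line in corpus:
        words = re.findall(r"[a-z]+", line.lower())
        for n in range(1, max_words + 1):
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                if abbrev in gram:
                    continue  # skip phrases that contain the abbreviation itself
                phrase = " ".join(gram)
                if phrase[0] == abbrev[0] and is_subsequence(abbrev, phrase):
                    counts[phrase] += 1
    # Most frequent candidates first.
    return counts.most_common()


# Toy corpus of real estate ads (invented for illustration).
ads = [
    "large living room with fireplace",
    "2 bdrm apt w/ livrm and den",
    "sunny living room, new kitchen",
]
print(candidate_expansions("livrm", ads))
```

A realistic system would need to weigh such candidates against context and against competing expansions; the subsequence test here only narrows the search.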
Some tools that we have produced that are useful for text normalization are Thrax, a finite-state
grammar development library, and Sparrowhawk, a lightweight
open-source version of the Google Kestrel text normalization system.
More recently my team and I have been working on neural models of text
normalization. See:
Richard Sproat and Navdeep Jaitly, "An RNN Model of Text Normalization,"
Proceedings of Interspeech, 2017.
Richard Sproat,
"Inferring the Environment in a Text-to-Scene Conversion System",
First International Conference on Knowledge Capture (K-CAP
'01), Victoria, BC, Canada, 2001.
Bob Coyne and Richard Sproat,
"WordsEye: An Automatic Text-to-Scene Conversion system",
SIGGRAPH 2001, Los Angeles, CA, 2001.
Future plans for this work are to continue working on it -- somehow.
Apart from trying to commercialize this technology, there are also
various scientific questions to explore: WordsEye makes for a
wonderful testbed for various ideas about language and meaning. One
idea that I want to develop is the notion that the basic building
blocks of WordsEye -- poses for actions, and so forth -- can be used
to build up a large-scale graphics-based lexical semantic
database. This idea is, of course, related to the use of pictorial
representations in Cognitive Grammar, and it is also related to the
work at U Penn on the Actionary.
Richard Sproat and Joseph Olive, "A Modular Architecture for
Multilingual Text-to-Speech," Proceedings of The Second ESCA/IEEE
Workshop on Speech Synthesis, 187-190, 1994.
Richard Sproat (editor),
Multilingual Text-to-Speech Synthesis: The Bell Labs
Approach, Boston, MA, Kluwer Academic Publishers, 1997.
I have been involved in markup standards for TTS, including being
one of the developers of
SABLE:
Richard Sproat, Andrew Hunt, Mari Ostendorf, Paul Taylor, Alan Black,
Kevin Lenzo, Mike Edgington, "SABLE: A
Standard for TTS Markup", International Conference on Spoken
Language Processing, 1998. Also presented at the ESCA/COCOSDA
Speech Synthesis Workshop. Jenolan Caves, Australia; and at the W3C
Meeting on Voice Browsers, Cambridge, MA, October 13, 1998.
Richard Sproat, T. V. Raman.
"SABLE: an XML-based Aural Display List For The WWW". 1999.
I was a participant in the W3C's
Voice Browser Working Group.
Mari Ostendorf, Andrew Hunt and I organized the two-day NSF
Workshop for Discussing Research Priorities and Evaluation Strategies
in Speech Synthesis, held at the National Science Foundation,
Arlington, VA, August 6-7, 1998. A final report from that workshop
is available.
I am co-PI on NSF ITR Award #0205731,
"Prosody Generation for Child Oriented Speech Synthesis."
Richard Sproat,
A Computational Theory of Writing Systems, (ACL Studies
in Natural Language Processing Series), Cambridge, Cambridge
University Press, 2000.
Recently I've been doing some work on rongorongo, the Easter Island script,
using approximate string matching techniques to discover parallel
texts in the rongorongo corpus.
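As a rough illustration of what approximate string matching over a glyph corpus can look like, the sketch below compares fixed-length windows of two glyph sequences by edit distance and reports the window pairs that differ by at most a small number of glyphs. This is a generic technique, not necessarily the one used in the rongorongo work; the glyph codes, the window length, and the threshold are all assumptions made for the example.

```python
def edit_distance(a, b) -> int:
    """Levenshtein distance between two sequences (strings or lists of glyph codes)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def parallel_windows(text_a, text_b, window: int = 5, max_dist: int = 1):
    """Return (i, j, distance) for window pairs within max_dist edits of each other."""
    matches = []
    for i in range(len(text_a) - window + 1):
        win_a = text_a[i:i + window]
        for j in range(len(text_b) - window + 1):
            d = edit_distance(win_a, text_b[j:j + window])
            if d <= max_dist:
                matches.append((i, j, d))
    return matches


# Hypothetical glyph codes; real corpora use their own transcription schemes.
tablet_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
tablet_b = ["x", "g1", "g2", "g9", "g4", "g5", "y"]
print(parallel_windows(tablet_a, tablet_b))
```

The brute-force comparison is quadratic in the number of windows, which is acceptable for a small corpus but would need indexing or pruning for a large one.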
In recent work, I have explored the layout of South Asian scripts and
its relation to psycholinguistic work on phonological awareness:
Richard Sproat,
"A Formal Computational Analysis of Indic Scripts", International Symposium
on Indic Scripts: Past and Future, Tokyo, December 2003.
Richard Sproat. "Brahmi-Derived Scripts, Script Layout, and
Phonological Awareness." Written Language and Literacy,
2006.
Steve Farmer, Richard Sproat and Michael Witzel.
"The Collapse of the
Indus-Script Thesis: The Myth of a Literate Harappan
Civilization". Electronic Journal of Vedic Studies, 11(2), 2004.
Tao Tao, Su-Youn Yoon, Andrew Fister, Richard Sproat and ChengXiang
Zhai. "Unsupervised Named Entity Transliteration Using Temporal and
Phonetic Correlation."
EMNLP,
July 22-23, 2006, Sydney, Australia.
Richard Sproat, Tao Tao and ChengXiang Zhai.
"Named Entity Transliteration with Comparable Corpora".
ACL 2006, July 17-21,
2006, Sydney, Australia.
Alla Rozovskaya, Richard Sproat and Elabbas Benmamoun. "Language
Modeling of Arabic Dialects". Colloquium on Arabic Language
Processing, June 5-7, Rabat, Morocco.
I have worked on the prediction of accent placement in complex
nominals in English both from a linguistic perspective
and on applications to text-to-speech synthesis.
Osamu
Fujimura and I used the
X-Ray Microbeam Facility at the University of Wisconsin, Madison,
to study the allophonic variation in English /l/ in different prosodic
and syllable positions. Most previous descriptions had treated the
difference between (syllable-initial) "light" /l/ and (syllable-final)
"dark" /l/ as a categorical distinction, to be handled, e.g., by
low-level phonological
rules. We argued, in contrast, that the variation could be entirely
explained by contextual factors, such as syllable position and
prosodic contexts. The following paper came out of those experiments:
I have worked, mostly with Chilin Shih, on various problems in Chinese
linguistics and computational linguistics. Some example publications:
I wrote a textbook on computational morphology that has had a fair
amount of use, though it's a bit out of date now:
My interests in corpus-based methods have mostly been in applying
techniques to new linguistic or computational linguistic problems,
rather than in developing new techniques. Some examples:
My PhD thesis was on morphology, as were various other
publications:
I haven't done a lot of work in this area, though this is something
I'm planning to get into more. I will be on the dissertation committee
of Martin Jansche,
who is working on this topic.
Actually this is one area I don't have any publications in, because I
just started. Basically what I am doing is looking for useful
information in large speech corpora, which might, for example, be
customer calls to customer care representatives. The difference
between speech mining and text mining is that speech recognition is
often very bad. Of course, one answer is to improve recognition, but
the reality is that when you move to a new task or a new domain you
will likely be faced with poor recognizer performance. I am
investigating methods that might help compensate for this poor
recognizer performance.
I used to work on this, but it's been a while.
This has been one of my main areas of contribution. I was the main
person responsible for the multilingual text processing module of the
Bell Labs Multilingual
TTS System. I have been particularly interested in finite-state
methods in text analysis, and I developed a toolkit called
lextools,
which is freely available for non-commercial use. A version of
lextools was used in the Bell Labs work, and it is being used in
various text-analysis applications at AT&T, as well as elsewhere;
e.g. in the DeKo project
at Stuttgart.
In late 1999 I teamed up with a 3D graphics expert, Bob Coyne, to
develop a system, called WordsEye,
that converts English descriptions of scenes into 3D graphical
representations of those scenes. This has been one of the most
interesting projects I've worked on over the last few years, but
unfortunately neither it, nor Bob, survived the recent downsizing at
AT&T Labs.
While most of my work in TTS has focused on text analysis (with some
work on accent prediction), I have also contributed to other
areas. Actually, one of the first serious bits of work in TTS I did
was to redesign and reimplement a modular architecture for the
AT&T Bell Labs (English) TTS system:
I am interested in understanding the
formal relationship between language and its encoding in writing. I
have a book on this topic:
List of known
errata