Research Interests

This is a compendium of some of the things I have worked on, with some associated publications. Much of this is quite old though. For a more up-to-date list of what I'm currently involved in, see my home page.

My interests range from the theoretical to the applied. At one time or another I have done work in all of the following areas. Here I list only those areas where I have actually published things, along with a few sample publications in the area. A fuller list of publications can be found on the publications page.

Accent Prediction Articulatory Phonetics Chinese Linguistics Computational Morphology

Corpus-Based Methods Morphology Pronunciation Modeling Speech Data Mining

Syntax Text Analysis/Normalization Text-to-Scene Conversion Text-to-Speech Synthesis

Writing Systems

Named entity transliteration Arabic language modeling ASR for accented speech

Accent Prediction.
I have worked on the prediction of accent placement in complex nominals in English both from a linguistic perspective and on applications to text-to-speech synthesis.
Mark Liberman and Richard Sproat, "The Stress and Structure of Modified Noun Phrases in English," in I. Sag (ed.), Lexical Matters, 131-181, CSLI Publications, Chicago, University of Chicago Press, 1992.
Richard Sproat, "English Noun-Phrase Accent Prediction for Text-to-Speech." Computer Speech and Language, 8, 79-94, 1994.
Articulatory Phonetics.
Osamu Fujimura and I used the X-Ray Microbeam Facility at the University of Wisconsin, Madison, to study the allophonic variation in English /l/ in different prosodic and syllable positions. Most previous descriptions had treated the difference between (syllable-initial) "light" /l/ and (syllable-final) "dark" /l/ as a categorical distinction to be handled, e.g., by low-level phonological rules. We argued, in contrast, that the variation could be entirely explained by contextual factors, such as syllable position and prosodic context. The following paper came out of those experiments:
Richard Sproat and Osamu Fujimura, "Allophonic variation in English /l/ and its implications for phonetic implementation," Journal of Phonetics, 21, 291-311, 1993.
One thing I'd like to do, given enough time, is return to this work and do a complete survey of all the consonants of a language. While there have been detailed articulatory studies of many classes of consonants (e.g. Krakow's work on nasals), to my knowledge nobody has done a complete survey of the effects of prosodic and syllable affiliation on the full range of consonants for English, or any other language.
Chinese Linguistics.
I have worked, mostly with Chilin Shih, on various problems in Chinese linguistics and computational linguistics. Some example publications:
Chilin Shih and Richard Sproat, "Mandarin Text-to-Speech Synthesis," Proceedings of the Eighth North American Conference on Chinese Linguistics, 1997.
Richard Sproat and Chilin Shih. "A Corpus-Based Analysis of Mandarin Nominal Root Compounds." Journal of East Asian Linguistics, 5, 49-71, 1996.
Richard Sproat and Chilin Shih, "Why Mandarin Morphology is not Stratum-Ordered." Yearbook of Morphology, 185-217, 1993.
Richard Sproat and Chilin Shih, "The Cross-Linguistic Distribution of Adjective Ordering Restrictions," in C. Georgopoulos and R. Ishihara (eds.), Interdisciplinary Approaches to Language: Essays in Honor of S.-Y. Kuroda, 565-593, Dordrecht, Kluwer Academic Publishers, 1990.
Computational Morphology.
I wrote a textbook on computational morphology that has had a fair amount of use, though it's a bit out of date now:
Richard Sproat, Morphology and Computation, Cambridge, MA, MIT Press, 1992.
I have done various pieces of work on computational morphology over the years. Note that a lot of the work that I did on Text Analysis for TTS for languages like Russian involved a non-trivial amount of morphology.
Richard Sproat and Barbara Brunson, "Constituent-Based Morphological Parsing: A New Approach to the Problem of Word-Recognition," 25th Annual Meeting of the Association for Computational Linguistics: Proceedings of the Conference, 65-72, 1987.
Elena Pavlova, Chilin Shih and Richard Sproat, "A Text-to-Speech System for Russian," Proceedings of EUROSPEECH, 1997.
Corpus-Based Methods.
My interests in corpus-based methods have mostly been in applying techniques to new linguistic or computational linguistic problems, rather than in developing new techniques. Some examples:
Richard Sproat, "Inferring the Environment in a Text-to-Scene Conversion System", First International Conference on Knowledge Capture (K-CAP '01), Victoria, BC, Canada, 2001.
Harald Baayen and Richard Sproat, "Estimating Lexical Priors for Low-Frequency Morphologically Ambiguous Forms," Computational Linguistics, 22(2), 1996.
Richard Sproat and Chilin Shih. "A Corpus-Based Analysis of Mandarin Nominal Root Compounds." Journal of East Asian Linguistics, 5, 49-71, 1996.
Richard Sproat and Chilin Shih, "A Statistical Method for Finding Word Boundaries in Chinese Text," Computer Processing of Chinese and Oriental Languages, 4, 336-351, 1990.
Morphology.
My PhD thesis was on morphology, as were various other publications:

Richard Sproat and Chilin Shih. "A Corpus-Based Analysis of Mandarin Nominal Root Compounds." Journal of East Asian Linguistics, 5, 49-71, 1996.
Richard Sproat and Chilin Shih, "Why Mandarin Morphology is not Stratum-Ordered." Yearbook of Morphology, 185-217, 1993.
Gregory Ward, Richard Sproat and Gail McKoon, "A Pragmatic Analysis of So-Called Anaphoric Islands," Language, 67, 439-474, 1991.
Richard Sproat, On Deriving the Lexicon. MIT Working Papers in Linguistics, Cambridge, MA. 1985. (Published version of Ph.D. dissertation)
Pronunciation Modeling.
I haven't done a lot of work in this area, though this is something I'm planning to get into more. I will be on the dissertation committee of Martin Jansche, who is working on this topic.
Richard Sproat, "Pmtools: A Pronunciation Modeling Toolkit", Proceedings of the Fourth ISCA Tutorial and Research Workshop on Speech Synthesis, Blair Atholl, Scotland, 2001.
Speech Data Mining.
Actually this is one area I don't have any publications in, because I just started. Basically what I am doing is looking for useful information in large speech corpora, which might, for example, be customer calls to customer care representatives. The difference between speech mining and text mining is that speech recognition is often very bad. Of course, one answer is to improve recognition, but the reality is that when you move to a new task or a new domain you will likely be faced with poor recognizer performance. I am investigating methods that might help compensate for this poor recognizer performance.
Syntax.
I used to work on this, but it's been a while.
Richard Sproat, "Welsh Syntax and VSO Structure," Natural Language and Linguistic Theory, 3, 173-216, 1985.
Text Analysis and Text Normalization.
This has been one of my main areas of contribution. I was the main person responsible for the multilingual text processing module of the Bell Labs Multilingual TTS System. I have been particularly interested in finite-state methods in text analysis, and I developed a toolkit called lextools, which is freely available for non-commercial use. A version of lextools was used in the Bell Labs work, and it is being used in various text-analysis applications at AT&T, as well as elsewhere; e.g. in the DeKo project at Stuttgart.
In 1999 I headed a project on Text Normalization at the Johns Hopkins University WS99. While the project was about all aspects of Text Normalization (standard abbreviation expansion, number expansion, currency amounts, etc.), my own particular interest was in how one might infer the expansion of non-standard abbreviations given an unannotated corpus, where the correct answer is to be found somewhere in the corpus. For example, in a corpus of real estate ads, one might find livrm, and infer that the correct expansion is living room based on the occurrence of that phrase somewhere else in the corpus. Various papers related to issues in text analysis are:
Richard Sproat, Alan Black, Stanley Chen, Shankar Kumar, Mari Ostendorf, and Christopher Richards. "Normalization of non-standard words." Computer Speech and Language, 15(3), 287-333, 2001.
Richard Sproat. "Multilingual Text Analysis for Text-to-Speech Synthesis", Natural Language Engineering, 2(4), 369-380, 1997.
Richard Sproat, Chilin Shih, William Gale and Nancy Chang, "A Stochastic Finite-State Word-Segmentation Algorithm for Chinese," Computational Linguistics, 22(3), 1996.
Mehryar Mohri and Richard Sproat. "An Efficient Compiler for Weighted Rewrite Rules", 34th Annual Meeting of the Association for Computational Linguistics: Proceedings of the Conference, 1996.
Some tools that we have produced that are useful for text normalization are Thrax, a finite-state grammar development library, and Sparrowhawk, a lightweight open-source version of the Google Kestrel text normalization system.
More recently, my team and I have been working on neural models of text normalization. See Richard Sproat and Navdeep Jaitly, "An RNN Model of Text Normalization," Proceedings of Interspeech, 2017.
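As a rough illustration of the abbreviation-expansion idea described above (inferring living room from livrm), here is a minimal Python sketch. It is not the method developed in the WS99 project. It is a simple heuristic I am assuming for illustration: a corpus phrase is a candidate if it starts with the same letter as the abbreviation and contains the abbreviation's letters in order, and candidates are ranked by how often they occur. The function names and the toy corpus are hypothetical.

```python
from __future__ import annotations

import re
from collections import Counter


def is_subsequence(abbrev: str, text: str) -> bool:
    """Return True if the letters of abbrev occur in text in the same order."""
    remaining = iter(text)
    return all(ch in remaining for ch in abbrev)


def candidate_expansions(abbrev: str, corpus: list[str], max_words: int = 2):
    """Rank corpus phrases (1..max_words words) that could expand abbrev.

    Illustrative heuristic only: same first letter, abbreviation letters
    appear in order, and the phrase does not contain the abbreviation itself.
    """
    counts = Counter()
    for line in corpus:
        words = re.findall(r"[a-z]+", line.lower())
        for n in range(1, max_words + 1):
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                if abbrev in gram:
                    continue  # skip phrases that merely repeat the abbreviation
                joined = "".join(gram)
                if (len(joined) > len(abbrev)
                        and joined[0] == abbrev[0]
                        and is_subsequence(abbrev, joined)):
                    counts[" ".join(gram)] += 1
    # Most frequent candidates first.
    return counts.most_common()


# Invented toy corpus of real estate ads, for illustration only.
corpus = [
    "sunny living room with fireplace",
    "large living room and eat-in kitchen",
    "2 bdrm apt, livrm faces south",
    "living area opens to deck",
]

print(candidate_expansions("livrm", corpus))
```

On this toy corpus the only surviving candidate is "living room". A realistic system would need a much better scoring model than raw frequency, since many unrelated phrases can contain an abbreviation's letters in order.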
Text-to-Scene Conversion.
In late 1999 I teamed up with a 3D graphics expert, Bob Coyne, to develop a system, called WordsEye, that converts English descriptions of scenes into 3D graphical representations of those scenes. This has been one of the most interesting projects I've worked on over the last few years, but unfortunately neither it, nor Bob, survived the recent downsizing at AT&T Labs.
Richard Sproat, "Inferring the Environment in a Text-to-Scene Conversion System", First International Conference on Knowledge Capture (K-CAP '01), Victoria, BC, Canada, 2001.
Bob Coyne and Richard Sproat, "WordsEye: An Automatic Text-to-Scene Conversion system", SIGGRAPH 2001, Los Angeles, CA, 2001.
Future plans for this work are to continue working on it -- somehow. Apart from trying to commercialize this technology, there are also various scientific questions to explore: WordsEye makes for a wonderful testbed for various ideas about language and meaning. One idea that I want to develop is the notion that the basic building blocks of WordsEye -- poses for actions, and so forth -- can be used to build up a large-scale graphics-based lexical semantic database. This idea is, of course, related to the use of pictorial representations in Cognitive Grammar, and it is also related to the work at U Penn on the Actionary.
Text-to-Speech Synthesis.
While most of my work in TTS has focused on text analysis (with some work on accent prediction), I have also contributed to other areas. Actually, one of the first serious bits of work in TTS I did was to redesign and reimplement a modular architecture for the AT&T Bell Labs (English) TTS system:
Richard Sproat and Joseph Olive, "A Modular Architecture for Multilingual Text-to-Speech," Proceedings of The Second ESCA/IEEE Workshop on Speech Synthesis, 187-190, 1994.
Richard Sproat (editor), Multilingual Text-to-Speech Synthesis: The Bell Labs Approach, Boston, MA, Kluwer Academic Publishers, 1997.
I have been involved in markup standards for TTS, including being one of the developers of SABLE:
Richard Sproat, Andrew Hunt, Mari Ostendorf, Paul Taylor, Alan Black, Kevin Lenzo, Mike Edgington, "SABLE: A Standard for TTS Markup", International Conference on Spoken Language Processing, 1998. Also presented at the ESCA/COCOSDA Speech Synthesis Workshop. Jenolan Caves, Australia; and at the W3C Meeting on Voice Browsers, Cambridge, MA, October 13, 1998.
Richard Sproat, T. V. Raman. "SABLE: an XML-based Aural Display List For The WWW". 1999.
I was a participant in the W3C's Voice Browser Working Group.
Mari Ostendorf, Andrew Hunt and I organized the two-day NSF Workshop for Discussing Research Priorities and Evaluation Strategies in Speech Synthesis, held at the National Science Foundation, Arlington, VA, August 6-7, 1998. Here is the final report from that workshop.
I am co-PI on NSF ITR Award #0205731, Prosody Generation for Child Oriented Speech Synthesis.
Writing Systems.
I am interested in understanding the formal relationship between language and its encoding in writing. I have a book on this topic:
Richard Sproat, A Computational Theory of Writing Systems, (ACL Studies in Natural Language Processing Series), Cambridge, Cambridge University Press, 2000.
List of known errata
Recently I've been doing some work on rongorongo, the Easter Island script, using approximate string matching techniques to discover parallel texts in the rongorongo corpus.
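To give a flavor of what approximate string matching over a glyph corpus can look like, here is a minimal Python sketch. It is an assumption-laden illustration, not the procedure used in the rongorongo work: texts are represented as lists of integer glyph codes (the codes below are invented, not real catalogue numbers), and two fixed-length windows count as a candidate parallel if their Levenshtein edit distance is within a threshold. The function names, window size, and threshold are all hypothetical.

```python
from __future__ import annotations


def edit_distance(a: list[int], b: list[int]) -> int:
    """Levenshtein distance between two sequences of glyph codes."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x from a
                curr[j - 1] + 1,         # insert y into a
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def near_parallels(text_a: list[int], text_b: list[int],
                   window: int = 5, max_dist: int = 1):
    """Return (start_a, start_b, distance) for window pairs within max_dist edits."""
    hits = []
    for i in range(len(text_a) - window + 1):
        for j in range(len(text_b) - window + 1):
            d = edit_distance(text_a[i:i + window], text_b[j:j + window])
            if d <= max_dist:
                hits.append((i, j, d))
    return hits


# Invented glyph-code sequences, for illustration only.
text_a = [1, 6, 200, 10, 8, 76, 4]
text_b = [3, 200, 10, 9, 76, 4, 2, 1]

print(near_parallels(text_a, text_b))
```

This brute-force comparison is quadratic in the length of the texts, which is acceptable for a small corpus; a larger one would call for indexing or a smarter alignment method.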
In recent work, I have explored the layout of South Asian scripts and its relation to psycholinguistic work on phonological awareness:
Richard Sproat, "A Formal Computational Analysis of Indic Scripts", International Symposium on Indic Scripts: Past and Future, Tokyo, December 2003.
Richard Sproat. "Brahmi-Derived Scripts, Script Layout, and Phonological Awareness." Written Language and Literacy, 2006.
Steve Farmer, Richard Sproat and Michael Witzel. "The Collapse of the Indus-Script Thesis: The Myth of a Literate Harappan Civilization". Electronic Journal of Vedic Studies, 11(2), 2004.
Named entity transliteration.

Tao Tao, Su-Youn Yoon, Andrew Fister, Richard Sproat and ChengXiang Zhai. "Unsupervised Named Entity Transliteration Using Temporal and Phonetic Correlation." EMNLP, July 22-23, 2006, Sydney, Australia.
Richard Sproat, Tao Tao and ChengXiang Zhai. "Named Entity Transliteration with Comparable Corpora". ACL 2006, July 17-21, 2006, Sydney, Australia.
Arabic language modeling.

Alla Rozovskaya, Richard Sproat and Elabbas Benmamoun. "Language Modeling of Arabic Dialects". Colloquium on Arabic Language Processing, June 5-7, Rabat, Morocco.
Emotion prediction from text.

Cecilia Ovesdotter Alm, Dan Roth and Richard Sproat. "Emotions from text: machine learning for text-based emotion prediction." HLT/EMNLP 2005. October 6-8, 2005, Vancouver.
Cecilia Alm and Richard Sproat. "Emotional sequencing and development in fairy tales." First International Conference on Affective Computing and Intelligent Interaction, Beijing, China, Oct. 22-24, 2005.
Cecilia Alm and Richard Sproat. "Perceptions of emotions in expressive storytelling." InterSpeech 2005, Lisbon, Portugal, Sep. 4-8, 2005.
ASR for accented speech.

Yanli Zheng, Richard Sproat, Liang Gu, Izhak Shafran, Haolang Zhou, Yi Su, Dan Jurafsky, Rebecca Starr, Su-Youn Yoon. "Accent Detection and Speech Recognition for Shanghai-Accented Mandarin." InterSpeech 2005, Lisbon, Portugal, Sep. 4-8, 2005.
