AT&T Labs - Research

Text-To-Speech (TTS) -- Selected Publications


Publications

2012

INTERSPEECH 2012 - PORTLAND, OREGON, SEP. 2012

Text-To-Speech Intelligibility Across Speech Rates [ pdf ]

Word Prominence Detection using Robust yet Simple Prosodic Features [ pdf ]

Predicting Character-Appropriate Voices for a TTS-based Storyteller System [ pdf ]

LREC 2012 - ISTANBUL, TURKEY, MAY 2012

Building Text-To-Speech Voices in the Cloud [ pdf ]

2011

ICASSP 2011 - PRAGUE, CZECH REPUBLIC, 2011

Using F0 to Constrain the Unit Selection Viterbi Network [ pdf ]

ASSETS 2011 - DUNDEE, SCOTLAND

On the Intelligibility of Fast Synthesized Speech for Individuals with Early-Onset Blindness [ pdf ]

SIGDIAL 2011 - PORTLAND, OREGON, 2011

Spoken Dialog Challenge 2010: Comparison of Live and Control Test Results [ pdf ]

INTERSPEECH 2011

Enriching text-to-speech synthesis using automatic dialog act tags [ pdf ]

AT&T VoiceBuilder: A Cloud-based Text-to-Speech Voice Builder Tool [ pdf ]

ACL 2011

Predicting Relative Prominence in Noun-Noun Compounds [ pdf ]

2010

SEVENTH SPEECH SYNTHESIS WORKSHOP, KYOTO, JAPAN, SEP. 2010

Composite TTS Voices [ pdf ]

Speech Acts and Dialog TTS [ pdf ]

SLT 2010

Demonstration of AT&T "Let's Go" in the 2010 Spoken Dialog Challenge [ pdf ]

INTERSPEECH 2010

Automatic Detection of Abnormal Stress Patterns in Unit Selection Synthesis [ pdf ]



INTERSPEECH 2008 - BRISBANE, AUSTRALIA, SEP. 2008

Improving Preselection in Unit Selection Synthesis [ pdf ]



SPEECH PROSODY 2008 CONFERENCE - CAMPINAS, BRAZIL, MAY 2008

Dialog speech acts and prosody: Considerations for TTS [ pdf ]



ICSLP 2006 - PITTSBURGH, PA, SEP. 2006

Six Approaches to Limited Domain Concatenative Speech Synthesis [ pdf ]

Expanding Phonetic Coverage in Unit Selection Synthesis Through Unit Substitution from a Donor Voice [ pdf ]

Phonetically Enriched Labeling in Unit Selection TTS Synthesis [ pdf ]



INTERSPEECH 2005 - LISBON, PORTUGAL, SEP. 2005

Perceptually-based Data-driven Join Costs: Comparing Join Types [ pdf | ps ]



ENCYCLOPEDIA OF LANGUAGE AND LINGUISTICS, 2nd EDITION, 2005

Voice Modification for Applications in Speech Synthesis [ pdf | ps ]



ELECTRICAL ENGINEERING HANDBOOK, 3rd EDITION, 2005

Chapter 16: Text-to-Speech Synthesis [ pdf | ps ]



INTERSPEECH 2004 - ICSLP, JEJU ISLAND, KOREA, OCT. 2004

Pronunciation lexicon adaptation for TTS voice building [ pdf | ps ]



FIFTH SPEECH SYNTHESIS WORKSHOP, PITTSBURGH, USA, JUNE 2004

Data-Driven Perceptually Based Join Costs [ pdf | ps ]

Improving TTS by Higher Agreement Between Predicted Versus Observed Pronunciations [ pdf | ps ]



146th Meeting of the ASA, AUSTIN, TEXAS, NOVEMBER 2003

Exploration of question intonation in read American English [ html ]



ICPhS 2003, BARCELONA, SPAIN, AUGUST 2003

Effects on TTS quality of methods of realizing natural prosodic variations [ pdf | ps ]



ICSLP 2002, DENVER, COLORADO, SEPTEMBER 2002

Expressive speech synthesis using a concatenative synthesizer [ pdf | ps ]

From text to prosody without ToBI [ pdf | ps ]

The AT&T German text-to-speech system: realistic linguistic description. [ pdf | ps ]

Automatic Segmentation Combining an HMM-Based Approach and Spectral Boundary Correction. [ pdf | ps ]



EUROSPEECH 2001, AALBORG, DENMARK, SEPTEMBER 2001

Phonetic effects on listener detection of vowel concatenation [ pdf | ps ]



ICASSP 2001, SALT LAKE CITY, UTAH, MAY 2001

Perceptual and objective detection of discontinuities in concatenative speech synthesis [ pdf | ps ]



VoiceXML Review, March 2001, Features Article #2

The Fundamentals of Text-to-Speech Synthesis [ html ]



SPEECH COMMUNICATION 33 (2001) 135-151, Special Issue on Speech Annotation and Corpus Tools

Automatic ToBI Prediction and Alignment to Speed Manual Labeling of Prosody [ pdf | ps ]



ICSLP 2000, BEIJING, CHINA, OCTOBER 2000

[Invited talk on] Corpus-based techniques in the AT&T NextGen synthesis system [ pdf | ps ]

Inter-transcriber reliability of ToBI Prosodic Labeling [ pdf | ps ]

Perceptual evaluation of automatic segmentation In Text-To-Speech Synthesis [ pdf | ps ]

Perceptually based automatic prosody labeling and prosodically enriched unit selection improve concatenative text-to-speech synthesis [ pdf | ps ]

Preselection of candidate units in a unit selection-based Text-To-Speech Synthesis System [ pdf | ps ]



IEEE PROCEEDINGS, AUGUST 2000

Speech and Language Processing for Next-Millennium Communications Services [ pdf ]



IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), NEW YORK, NY, 2000

Multimodal Speech Synthesis [ pdf ]

Efficient Modeling of Virtual Humans in MPEG-4 [ pdf ]

Very Low Bitrate Coding of Virtual Human Animation in MPEG-4 [ pdf ]

Talking Heads and Synthetic Speech: An Architecture for Supporting Electronic Commerce [ pdf ]



ICASSP 2000, ISTANBUL, TURKEY, JUNE 2000

On the Implementation of the Harmonic Plus Noise Model for Concatenative Speech Synthesis [ pdf | ps ]

Stochastic Modeling of Spectral Adjustment for High Quality Pitch Modification [ pdf | ps ]



IEEE SIGNAL PROCESSING LETTERS, VOLUME 7, NUMBER 5, MAY 2000

A simple and fast way for generating a harmonic signal [ pdf | ps ]



EUROSPEECH '99, BUDAPEST, HUNGARY, SEPT. 1999

Prosody Recognition from Speech Utterances Using Acoustic and Linguistic Based Models of Prosodic Events [ pdf | ps ]

Rapid Unit Selection from a Large Speech Corpus for Concatenative Speech Synthesis [ pdf | ps ]

Interaction of Units in a Unit Selection Database [ pdf | ps ]

Synchronization of speech frames based on phase data with application to concatenative speech synthesis [ pdf | ps ]

Detection of non-stationarity in speech signals and its application to time-scaling [ pdf | ps ]

Single Complex Sinusoid and ARHE Model based Pitch Extractors [ pdf | ps ]



JOINT MEETING OF ASA, EAA, AND DAGA, BERLIN, GERMANY, 15-19 MARCH 1999

The AT&T Next-Gen TTS System [ pdf | ps ] (Paper 2ASCA__4)

Analysis of Voiced Speech using Harmonic Models [ pdf | ps ] (Paper 5ASCA__2)

Robust Unit Selection for Speech Synthesis [ pdf | ps ] (Paper 1PSCB_10)



ICASSP-99, PHOENIX, ARIZONA, MARCH 1999

Assessment and Correction of Voice Quality Variabilities in Large Speech Databases for Concatenative Speech Synthesis [ pdf | ps ]



THIRD SPEECH SYNTHESIS WORKSHOP, JENOLAN CAVES, AUSTRALIA, NOV. 1998

Three Methods of Intonation Modeling [ ps ]

Parametric modeling of intonation using vector quantization [ ps ]

Diphone Synthesis using Unit Selection [ ps ]

Concatenative Speech Synthesis using a Harmonic plus Noise Model [ ps ]

Removing Phase Mismatches in Concatenative Speech Synthesis [ ps ]



ICSLP-98, SYDNEY, AUSTRALIA, NOV. 1998

Real Time Voice Alteration Based on Linear Prediction [ ps ]

Exploration of acoustic correlates in speaker selection for concatenative synthesis [ ps ]

Integration of talking heads and text-to-speech synthesizers for visual TTS [ ps ]



ICASSP-98, SEATTLE, WASHINGTON, 1998

TD-PSOLA versus Harmonic plus Noise Model in diphone based speech synthesis [ ps ]



EUROSPEECH '97, RHODES, GREECE, SEPT. 1997

Diphone Concatenation using a Harmonic plus Noise Model of Speech [ ps (5 MB) | ps.gz (800 kB) ]