Modelling Hesitation for Synthesis of Spontaneous Speech
Rolf Carlson¹, Kjell Gustafson¹,² and Eva Strangert³ *
1CSC, Department of Speech, Music and Hearing, KTH, Stockholm, Sweden
{rolf;kjellg}@speech.kth.se
2Acapela Group Sweden AB, Solna, Sweden
3Department of Philosophy and Linguistics, Phonetics, Umeå University, Sweden
strangert@ling.umu.se
*Names in alphabetical order
Abstract

The current work deals with the modelling of one type of disfluency, hesitations. A perceptual experiment using speech synthesis was designed to evaluate two duration features found to be correlates to hesitation, pause duration and final lengthening. A variation of F0 slope before the hesitation was also included. The most important finding is that it is the

With a few exceptions, relatively little effort has so far been spent on research on spontaneous speech synthesis with a focus on disfluencies. The introduction of the VoiceFont [2] pointed to the need to include extralinguistic features of this type in speech synthesis. Methods based on unit selection can naturally include some of the hesitation features present in spontaneous speech but this is mostly by accident. In recent