Jordi Adell, Antonio Bonafonte, David Escudero-Mancebo, Universitat Politècnica de Catalunya
This paper presents a new approach to the synthesis of filled pauses, motivated by the fact that they are as frequent as the most frequent words in conversational speech. The problem is tackled from the point of view of disfluent speech synthesis. Based on the synthetic disfluent speech model, we analyse the features that describe filled pauses and propose a model to predict them. The model was implemented and perceptually evaluated, with successful results.