Capturing Inter-speaker Invariance Using Statistical Measures of Rhythm

Tae-Jin Yoon, McMaster University

Statistical rhythm metrics are applied to the Buckeye corpus of spontaneous interview speech in order to investigate the extent of rhythmic variability between speakers as well as within each speaker. The Buckeye corpus is unique in that its speech was obtained from speakers who were raised in the same region of North America and hence share the same regional dialect. Rhythm metrics are computed for each of 10 speakers. The results show that the rhythm measures that exhibit the least variance among these speakers of the same dialect are the normalized pairwise variability indices calculated from the durations of adjacent consonantal intervals and of adjacent vocalic intervals. This finding implies that these statistical measures of rhythm can be used to capture dialectal similarities.
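
The abstract does not spell out how the normalized pairwise variability index (nPVI) is computed. As a minimal sketch, assuming the definition commonly used in the rhythm literature, the index averages the absolute difference between each pair of successive interval durations, normalizes each difference by the mean of the pair, and scales the result by 100. The function name `npvi` and the duration values below are hypothetical and serve only as illustration; they are not taken from the Buckeye corpus or from the study.

```python
def npvi(durations):
    """Normalized pairwise variability index over successive interval durations.

    Assumes the commonly used definition:
        nPVI = 100 * mean(|d_k - d_{k+1}| / ((d_k + d_{k+1}) / 2))
    Durations are assumed to be positive (e.g. seconds or milliseconds).
    """
    if len(durations) < 2:
        raise ValueError("nPVI needs at least two intervals")
    # One term per pair of adjacent intervals, each normalized by the pair's mean.
    terms = [
        abs(a - b) / ((a + b) / 2.0)
        for a, b in zip(durations, durations[1:])
    ]
    return 100.0 * sum(terms) / len(terms)


# Hypothetical interval durations in seconds (illustrative values only).
vocalic = [0.08, 0.12, 0.06, 0.15]
consonantal = [0.07, 0.07, 0.07]

print("vocalic nPVI:", npvi(vocalic))
print("consonantal nPVI:", npvi(consonantal))
```

Because each difference is divided by the local mean of the pair, the index is unaffected by a uniform change in speaking rate, which is one reason a normalized measure of this kind might be expected to vary less across speakers than raw duration statistics.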