Florian Hönig, Anton Batliner, Karl Weilhammer, Elmar Nöth, Pattern Recognition Lab
We recorded non-native English productions from 55 speakers; a subset of these productions was assessed by 60 native English speakers for quality with respect to intelligibility, rhythm, etc. Applying multiple linear regression to a large prosodic feature vector, which combines modelling approaches known from the literature with generic prosodic features, we can automatically predict the listeners' assessments with correlations of up to .85. We discuss the most important features and the limitations of this approach.
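To make the prediction setup concrete, the following is a minimal sketch of multiple linear regression evaluated by the Pearson correlation between predicted and reference scores. It is illustrative only: the data are synthetic, the five features, the weights in `true_w`, the noise level, and the simple train/test split are all assumptions, and the function names (`fit_linear_regression`, `predict`, `pearson`) are hypothetical. It does not reproduce the features, data, or evaluation protocol of this work.

```python
import random
from statistics import mean


def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept, via the normal equations
    (X^T X) w = X^T y, solved by Gauss-Jordan elimination."""
    rows = [[1.0] + list(x) for x in X]  # prepend a constant 1 for the intercept
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]  # [intercept, w_1, ..., w_d]


def predict(w, X):
    """Apply the fitted weights to each feature vector."""
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)) for x in X]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


# Synthetic stand-in data (assumption): 200 "utterances", 5 "prosodic features",
# and a noisy linear "listener score". None of this comes from the recordings.
rng = random.Random(0)
n_utt, n_feat = 200, 5
true_w = [0.8, -0.5, 0.3, 0.0, 0.2]
X = [[rng.gauss(0, 1) for _ in range(n_feat)] for _ in range(n_utt)]
y = [3.0 + sum(w * x for w, x in zip(true_w, row)) + rng.gauss(0, 0.5) for row in X]

# Fit on one part, evaluate on held-out items (split chosen for illustration).
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

w = fit_linear_regression(X_train, y_train)
r = pearson(predict(w, X_test), y_test)
print(f"held-out Pearson correlation: {r:.2f}")
```

Evaluating on held-out items matters with a large feature vector, since a regression with many predictors can fit the training scores closely without generalising to unseen speakers.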