Anna Margolis, University of Washington; Toyota Technological Institute at Chicago
Mari Ostendorf, University of Washington
Karen Livescu, Toyota Technological Institute at Chicago
We study methods for training a prosodic classifier on labeled data from a genre other than the one on which the system will be deployed. We address two binary tasks: word-level pitch accent and phrase boundary detection. Using radio news and conversational telephone speech, we evaluate cross-genre training with acoustic and textual features, and find that acoustic features transfer better than textual features in most cases. We also find that a single classifier trained on both genres nearly matches genre-dependent performance. We then examine some simple unsupervised domain adaptation approaches, including class proportion adjustment, sample selection bias correction, and feature normalization. With the exception of class proportion adjustment, which helps slightly in one case but proves unstable, none of the approaches improves cross-genre performance over the baseline.
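As a rough illustration of what class proportion adjustment can look like for a binary task, the sketch below rescales a source-trained posterior by the ratio of target to source class priors and renormalizes. This is a standard prior-shift correction and is only an assumption about the general idea; the abstract does not specify the exact procedure used. The function name `adjust_posterior` and the numeric values are hypothetical, and in an unsupervised setting the target prior would itself have to be estimated from unlabeled target-genre data.

```python
def adjust_posterior(p_pos, source_prior, target_prior):
    """Rescale a binary posterior P(y=1|x) from a source to a target class prior.

    Assumes only the class proportions differ between genres (prior shift),
    which is an illustrative simplification.
    """
    if not (0.0 < source_prior < 1.0 and 0.0 < target_prior < 1.0):
        raise ValueError("priors must lie strictly between 0 and 1")
    # Reweight each class score by target_prior / source_prior for that class.
    pos = p_pos * target_prior / source_prior
    neg = (1.0 - p_pos) * (1.0 - target_prior) / (1.0 - source_prior)
    # Renormalize so the two adjusted scores sum to one.
    return pos / (pos + neg)


# Hypothetical values: the positive class (e.g. phrase boundary) is assumed
# rarer in the target genre than in the source genre.
adjusted = adjust_posterior(p_pos=0.5, source_prior=0.5, target_prior=0.2)
```

With equal source and target priors the posterior is returned unchanged; a lower target prior pulls borderline decisions toward the negative class, so a poorly estimated target prior can shift many decisions at once.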