Caroline Kaufhold, Elmar Nöth, Friedrich-Alexander University Erlangen-Nuremberg
Spoken input of address data in modern GPS units is typically done by filling one information slot after another. To fill in multiple slots at once, the slot information contained in the input utterance has to be extracted. We employ phrase boundaries to segment the speech signal into individual slots. In our evaluation, we examine several types of input utterances that differ in the number of slot items and in their order. For each type, a set of twenty strong prosodic features is trained. By additionally incorporating supporting a priori features, an F-measure of 93.0\% is reached for a typical use case.
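
For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how an F-measure for phrase boundary detection could be computed. It assumes the balanced F-measure (harmonic mean of precision and recall) and exact matching of boundary positions given as word indices. The function name `boundary_f_measure` and this representation are illustrative assumptions, not the evaluation code used in this work.

```python
def boundary_f_measure(reference, predicted):
    """Balanced F-measure of predicted phrase boundaries.

    Both arguments are collections of word indices after which a
    boundary occurs (a hypothetical representation for illustration).
    """
    ref, pred = set(reference), set(predicted)
    true_pos = len(ref & pred)  # boundaries found at the correct position
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(ref) if ref else 0.0
    if precision + recall == 0.0:
        return 0.0
    # Harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)
```

For example, with reference boundaries after words 2 and 4 and predicted boundaries after words 2 and 5, precision and recall are both 0.5, so `boundary_f_measure({2, 4}, {2, 5})` returns 0.5.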