Modeling variability in speech recognition
A major challenge in automatic speech recognition is achieving good results on tasks where the spoken input is highly variable because the speaker or the acoustic conditions change frequently. For instance, spoken dialog systems connected to the public telephone network have to cope with non-native accents, dialects, speakers of different ages, low-volume speech, and varying signal quality. This work proposes a combination of several approaches to increase the robustness of a speech recognizer. Suitable adaptation and normalization methods are developed for the recognition of children's speech and non-native speech. Integrating acoustic and linguistic context into the models of the speech recognizer also yields improvements for sources of variability that were never or only rarely observed in the training data. Experimental results are reported for several data sets, including collections of non-native German and English speech, speech recorded with a spoken dialog system, and children's speech.