Rapid unsupervised adaptation to children's speech on a connected-digit task
Main Authors: | , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Summary: | We are exploring ways to rapidly adapt our neural network classifiers to new speakers and conditions using very small amounts of speech, say, one or a few words. Our approach is to perform a speaker-dependent warping of the frequency scale by selecting a Bark offset for each speaker. We choose the offset for a speaker to be the one that maximizes our recognizer output score on the adaptation utterance. We then use the speaker's offset during evaluation of all other utterances by the speaker. To test our approach, we evaluate a recognizer trained on adult speech on children's speech from the same task, both before and after adaptation to each child's voice. Using only a single digit for adaptation, we reduced the word error rate for children's speech from 9.6% to 4.2%. Using a seven-digit utterance further reduced the error rate to 3.5%. |
---|---|
DOI: | 10.1109/ICSLP.1996.607809 |
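
The offset selection described in the summary can be sketched as a small grid search. This is only an illustration under stated assumptions, not the authors' implementation: the Hz-to-Bark conversion uses Traunmüller's approximation (the record does not say which Bark formula the paper uses), and `select_offset`, `warp_frequency`, the candidate grid, and the `score` callback standing in for the recognizer output score are all hypothetical names.

```python
from typing import Callable, Iterable, TypeVar

U = TypeVar("U")  # type of an adaptation utterance (e.g. a waveform array)


def hz_to_bark(f_hz: float) -> float:
    """Traunmüller's Hz-to-Bark approximation (an assumed choice of formula)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_to_hz(z: float) -> float:
    """Algebraic inverse of hz_to_bark."""
    return 1960.0 * (z + 0.53) / (26.28 - z)


def warp_frequency(f_hz: float, offset_bark: float) -> float:
    """Shift a frequency by a constant offset on the Bark scale."""
    return bark_to_hz(hz_to_bark(f_hz) + offset_bark)


def select_offset(
    utterance: U,
    candidates: Iterable[float],
    score: Callable[[U, float], float],
) -> float:
    """Return the candidate offset with the highest recognizer score.

    `score(utterance, offset)` is a hypothetical stand-in for running the
    recognizer on the utterance with the frequency scale warped by `offset`.
    """
    return max(candidates, key=lambda off: score(utterance, off))


def relative_reduction(before: float, after: float) -> float:
    """Relative error-rate reduction, as a fraction of the starting rate."""
    return (before - after) / before
```

Once chosen on the adaptation utterance, the same offset would be reused for every other utterance from that speaker. For scale, the reported drop from 9.6% to 4.2% word error rate is a relative reduction of about 56%, and the drop to 3.5% is about 64%.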