A comparison of composite features under degraded speech in speaker recognition

Bibliographic Details
Main Authors:	Openshaw, J.P., Sun, Z.P., Mason, J.S.
Format:	Conference Proceeding
Language:	eng
Subjects:	Additive noise; Degradation; Mel frequency cepstral coefficient; Noise figure; Noise level; Noise robustness; Signal to noise ratio; Speaker recognition; Speech enhancement; Testing

Description
Summary:	A variety of features and their sensitivity to noise mismatch between the model and test noise conditions are assessed. The authors use speaker identification (SI) for a performance evaluation as it is very sensitive to feature changes, and propose a target for robustness in terms of matched noise conditions. Two primary features, mel frequency cepstral coefficients (MFCCs) and PLP, are considered along with their RASTA and first-order regression extensions. PLP-RASTA is found to give the best resilience under cross conditions for a single feature, and the linear discriminant analysis (LDA) combination of MFCC and PLP-RASTA gives the best performance overall. Only in combined training are satisfactory results for any feature found.
ISSN:	1520-6149 2379-190X