TY - GEN
T1 - Subspace analysis of spectral features for speaker recognition
AU - Chen, Ling
AU - Man, Hong
AU - Jia, Huading
AU - Wang, Zhiyi
AU - Wang, Lei
AU - Li, Zili
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2014/12/9
Y1 - 2014/12/9
N2 - A new front-end feature extraction scheme that creates so-called LDA-projected magnitude spectrum (L-PMS) features is proposed for speaker recognition systems. Mainstream feature extraction schemes usually use filter banks or linear predictive coding (LPC) to convert high-dimensional speech data into low-dimensional feature vectors, which may lose discriminative information that is important for speaker recognition tasks. In this work, the new feature extraction scheme takes the log of the magnitude spectrum of windowed utterance frames. After variance normalization of the spectral features, linear discriminant analysis (LDA) is applied to create features that are more discriminative than the conventional mel-frequency cepstral coefficient (MFCC) features. The new features were evaluated on the TIMIT and NTIMIT corpora using a vector quantization (VQ) speaker model. Experiments on all 630 subjects in the TIMIT and NTIMIT corpora show that the proposed L-PMS features substantially outperform the conventional MFCC features in terms of identification rate.
AB - A new front-end feature extraction scheme that creates so-called LDA-projected magnitude spectrum (L-PMS) features is proposed for speaker recognition systems. Mainstream feature extraction schemes usually use filter banks or linear predictive coding (LPC) to convert high-dimensional speech data into low-dimensional feature vectors, which may lose discriminative information that is important for speaker recognition tasks. In this work, the new feature extraction scheme takes the log of the magnitude spectrum of windowed utterance frames. After variance normalization of the spectral features, linear discriminant analysis (LDA) is applied to create features that are more discriminative than the conventional mel-frequency cepstral coefficient (MFCC) features. The new features were evaluated on the TIMIT and NTIMIT corpora using a vector quantization (VQ) speaker model. Experiments on all 630 subjects in the TIMIT and NTIMIT corpora show that the proposed L-PMS features substantially outperform the conventional MFCC features in terms of identification rate.
UR - http://www.scopus.com/inward/record.url?scp=84920560096&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84920560096&partnerID=8YFLogxK
U2 - 10.1109/FSKD.2014.6980814
DO - 10.1109/FSKD.2014.6980814
M3 - Conference contribution
AN - SCOPUS:84920560096
T3 - 2014 11th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2014
SP - 98
EP - 102
BT - 2014 11th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2014
T2 - 2014 11th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2014
Y2 - 19 August 2014 through 21 August 2014
ER -