Subspace analysis of spectral features for speaker recognition

Ling Chen, Hong Man, Huading Jia, Zhiyi Wang, Lei Wang, Zili Li

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

1 Scopus citation

Abstract

A new front-end feature extraction scheme that produces so-called LDA-projected magnitude spectrum (L-PMS) features is proposed for speaker recognition systems. Mainstream feature extraction schemes usually apply a filter bank or linear predictive coding (LPC) when converting high-dimensional speech data into low-dimensional feature vectors, which may discard discriminative information that matters for speaker recognition. In this work, the new scheme takes the logarithm of the magnitude spectrum of windowed utterance frames. After variance normalization of the spectral features, linear discriminant analysis (LDA) is applied to create features that are more discriminative than conventional mel-frequency cepstral coefficient (MFCC) features. The new features were evaluated on the TIMIT and NTIMIT corpora using a vector quantization (VQ) speaker model. Experiments on all 630 subjects in the TIMIT and NTIMIT corpora show that the proposed L-PMS features substantially outperform conventional MFCC features in terms of identification rate.
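The feature pipeline outlined in the abstract (log magnitude spectrum of windowed frames, variance normalization, then an LDA projection) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the Hamming window, the frame length and hop, the mean subtraction before scaling, the regularization term `reg`, and the synthetic sinusoidal "speakers" are all assumptions made for the example. The VQ speaker model is left out.

```python
import numpy as np

def log_magnitude_frames(signal, frame_len=256, hop=128, eps=1e-8):
    """Split a 1-D signal into Hamming-windowed frames; return log|FFT| per frame."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    return np.log(np.abs(np.fft.rfft(frames, axis=1)) + eps)

def fit_lda(X, y, n_components, reg=1e-6):
    """Estimate normalization statistics and an LDA projection matrix W."""
    mean, std = X.mean(axis=0), X.std(axis=0) + 1e-12
    Z = (X - mean) / std              # per-bin normalization (mean removal is an assumption)
    d = Z.shape[1]
    Sw = np.zeros((d, d))             # within-class scatter
    Sb = np.zeros((d, d))             # between-class scatter
    for c in np.unique(y):
        Zc = Z[y == c]
        mc = Zc.mean(axis=0)
        Sw += (Zc - mc).T @ (Zc - mc)
        Sb += len(Zc) * np.outer(mc, mc)   # the global mean of Z is zero
    # Whiten the regularized within-class scatter, then diagonalize Sb in that space.
    evals, evecs = np.linalg.eigh(Sw + reg * np.trace(Sw) / d * np.eye(d))
    whiten = evecs / np.sqrt(evals)
    b_evals, b_evecs = np.linalg.eigh(whiten.T @ Sb @ whiten)
    order = np.argsort(b_evals)[::-1][:n_components]
    return mean, std, whiten @ b_evecs[:, order]

def project(X, mean, std, W):
    """Apply the stored normalization and the LDA projection to new frames."""
    return ((X - mean) / std) @ W

# Synthetic stand-in for three speakers: noisy sinusoids at different frequencies.
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
X_parts, y_parts = [], []
for speaker, f0 in enumerate([300.0, 500.0, 700.0]):
    sig = np.sin(2 * np.pi * f0 * t) + 0.5 * rng.standard_normal(fs)
    spec = log_magnitude_frames(sig)
    X_parts.append(spec)
    y_parts.append(np.full(len(spec), speaker))
X, y = np.vstack(X_parts), np.concatenate(y_parts)

mean, std, W = fit_lda(X, y, n_components=2)
feats = project(X, mean, std, W)
print(feats.shape)
```

With C classes, LDA yields at most C - 1 useful directions, which is why the toy example keeps two components for three synthetic speakers.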

Original language: English
Title of host publication: 2014 11th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2014
Pages: 98-102
Number of pages: 5
ISBN (Electronic): 9781479951482
State: Published - 9 Dec 2014
Event: 2014 11th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2014 - Xiamen, China
Duration: 19 Aug 2014 to 21 Aug 2014

Publication series

Name: 2014 11th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2014
