WSEAS Transactions on Signal Processing
Print ISSN: 1790-5052, E-ISSN: 2224-3488
Volume 9, 2013
Language and Text-Independent Speaker Identification System Using GMM
Authors: ,
Abstract: This paper motivates the use of the Dynamic Mel-Frequency Cepstral Coefficient (DMFCC) feature, and of a combination of DMFCC and MFCC features, for robust language- and text-independent speaker identification. The MFCC feature, modeled on the human auditory system, has been widely used for speaker recognition because it is less vulnerable to noise perturbation and shows little session variability. However, the human auditory system is also sensitive to pitch changes in speech. Therefore, an algorithm that integrates changes in speaker-specific pitch information into the design of the dynamic Mel-scale filter bank improves the effectiveness of speaker identification. The individual Gaussian components of a Gaussian Mixture Model (GMM) represent vocal tract configurations that are effective for speaker identification. The performance of the speaker identification system is evaluated experimentally on a microphone speech database of 120 speakers. The experiments examine the speaker Identification Error Rate (IDER) using test segments of different lengths and text-independent utterances in Tamil and English. Compared with identification error rates of 5.8% for the MFCC-based system and 2.9% for the DMFCC system, an error rate of 1.2% is obtained when DMFCC feature vectors are added to MFCC feature vectors to form the combined feature. The experimental results confirm that the GMM is efficient for language- and text-independent speaker identification.
Keywords: Speaker Identification, Mel-scale filter bank, Gaussian filters, Mel Frequency Cepstral Coefficient, Dynamic Mel Frequency Cepstral Coefficient, Gaussian Mixture Model
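
The identification scheme summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the combined feature is formed by concatenating each MFCC frame with its DMFCC frame (the abstract only says the vectors are "added"), that each speaker is modeled by a diagonal-covariance GMM, and that a test segment is assigned to the speaker whose model gives the highest total log-likelihood. All function names, model parameters, and feature values are hypothetical toy values, not data from the paper.

```python
import math

def combine_features(mfcc_frames, dmfcc_frames):
    """Concatenate MFCC and DMFCC vectors frame by frame (assumed combination rule)."""
    return [m + d for m, d in zip(mfcc_frames, dmfcc_frames)]

def diag_gmm_loglik(frame, weights, means, variances):
    """Log-likelihood of one frame under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))

def identify(frames, speaker_models):
    """Return the speaker whose GMM gives the highest total log-likelihood."""
    scores = {
        name: sum(diag_gmm_loglik(f, *model) for f in frames)
        for name, model in speaker_models.items()
    }
    return max(scores, key=scores.get)

def identification_error_rate(predicted, actual):
    """Percentage of test segments assigned to the wrong speaker (IDER)."""
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return 100.0 * wrong / len(actual)

# Hypothetical two-speaker models: (weights, means, variances) per speaker.
speaker_models = {
    "speaker_a": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "speaker_b": ([0.5, 0.5], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]]),
}

# Toy test segments: one MFCC and one DMFCC coefficient per frame.
segment_1 = combine_features([[0.2], [0.9]], [[0.1], [1.1]])
segment_2 = combine_features([[5.5], [6.1]], [[5.2], [5.9]])

predicted = [identify(segment_1, speaker_models), identify(segment_2, speaker_models)]
ider = identification_error_rate(predicted, ["speaker_a", "speaker_b"])
print(predicted, ider)
```

In practice the GMM parameters would be estimated from each speaker's training speech, for example with the expectation-maximization algorithm, rather than set by hand as in this toy example.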