WSEAS Transactions on Acoustics and Music
Print ISSN: 1109-9577
Volume 5, 2018
Weighted Multi-band Summary Correlogram (MBSC)-based Pitch Estimation and Voice Activity Detection for Noisy Speech
Authors: , ,
Abstract: Pitch estimation and voice activity detection (VAD), the task of classifying an acoustic signal stream into voiced and unvoiced segments, are crucial preprocessing tools for a wide range of speech applications. In this paper, a weighted multi-band summary correlogram (MBSC)-based pitch estimation algorithm (PEA) that also performs VAD is proposed. The PEA performs pitch estimation and voiced/unvoiced (V/UV) detection via novel signal processing schemes designed to enhance the MBSC’s peaks at the most likely pitch period. The technique first computes an independent normalized autocorrelation function (NACF) for each channel or frame, which makes it relatively insensitive to phase changes across channels. It then filters these NACFs to remove a significant portion of the content outside the 50-500 Hz pitch range and derives an adaptive threshold from the filtered NACFs. This threshold acts as both a pitch position indicator and a voiced/unvoiced region detector. The accurate pitch period is then obtained from the weighted MBSC. Among the algorithms evaluated, the proposed algorithm has the lowest gross pitch error (%GPE) for noisy speech in the evaluation set. The proposed PEA also achieves the lowest average voicing detection errors.
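The NACF step described in the abstract can be illustrated with a minimal single-band sketch. It is not the paper's method: the function names `nacf` and `estimate_pitch`, the fixed `threshold` of 0.5 (the paper derives an adaptive threshold), and the first-local-peak selection rule are assumptions made for illustration only, and the multi-band weighting of the MBSC is omitted.

```python
import math

def nacf(frame, max_lag):
    """Normalized autocorrelation of one frame for lags 0..max_lag."""
    n = len(frame)
    values = []
    for lag in range(max_lag + 1):
        a = frame[:n - lag]   # segment starting at sample 0
        b = frame[lag:]       # same-length segment delayed by `lag`
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        values.append(num / den if den > 0 else 0.0)
    return values

def estimate_pitch(frame, fs, fmin=50.0, fmax=500.0, threshold=0.5):
    """Return a pitch estimate in Hz, or None if the frame looks unvoiced.

    Assumptions (not from the paper): a fixed voicing threshold and
    picking the first local NACF peak close to the global maximum.
    """
    lag_min = max(int(fs / fmax), 1)               # shortest period searched
    lag_max = min(int(fs / fmin), len(frame) - 2)  # longest period searched
    r = nacf(frame, lag_max + 1)
    peak = max(r[lag_min:lag_max + 1])
    if peak < threshold:
        return None  # treated as unvoiced
    # Prefer the smallest lag that is a local peak, to avoid octave errors.
    for k in range(lag_min, lag_max + 1):
        if r[k] >= 0.9 * peak and r[k] >= r[k - 1] and r[k] >= r[k + 1]:
            return fs / k
    return None

# Usage: a 50 ms frame of a 200 Hz sine sampled at 8 kHz.
fs = 8000
voiced = [math.sin(2 * math.pi * 200 * i / fs) for i in range(400)]
silence = [0.0] * 400
print(estimate_pitch(voiced, fs), estimate_pitch(silence, fs))
```

Restricting the lag search to `fs / fmax` through `fs / fmin` plays the role of the 50-500 Hz pitch-range limit in this sketch, and the single threshold doubles as the voiced/unvoiced decision, mirroring the dual role the abstract assigns to its adaptive threshold.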
Keywords: multi-band summary correlogram, empirical mode decomposition, normalized autocorrelation, voiced/unvoiced speech
Pages: 20-27
WSEAS Transactions on Acoustics and Music, P-ISSN: 1109-9577, Volume 5, 2018, Art. #3