Improvement of Throat Microphone Speech by Enhancing the Spectral Envelope Using a GMR-LPC Based Method
A throat microphone (TM) is a body-attached transducer worn against the neck. Because it is a bone-conduction sensor, TM-recorded speech loses intelligibility and naturalness. In this work, the quality and intelligibility of TM speech are improved by artificial bandwidth extension using a GMR-LPC based method. The proposed method consists of two phases. In the first phase, the high-band spectrum is estimated as a mixture of the downsampled linear prediction (LP) residual of the TM speech and modulated white Gaussian noise. In the second phase, the spectral content of the TM speech that is lost in the high band (4–8 kHz) is estimated using a Gaussian mixture regression (GMR) mel-spectrum extension with a discrete wavelet transform (DWT) filter bank implementation. To this end, the high-band mel spectrum is predicted from the high-frequency spectral components present in the low band, with GMR serving as the learned mapping. The spectrum is then divided into subbands using the DWT, and each subband is weighted by the spectrum obtained in the first phase to realize the estimated high band. The artificially bandwidth-extended output is obtained as the sum of the estimated high band and the original TM speech. The proposed method is evaluated with objective and subjective tests. Both tests indicate that it improves the quality and intelligibility of TM speech.
Keywords - Speech enhancement, Throat Microphone, Gaussian Mixture Regression, Mel Frequency Cepstral Coefficient, Wavelet Decomposition, Linear Prediction Residual.
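As an illustration of the GMR step named in the abstract, the following is a minimal sketch of the standard Gaussian mixture regression predictor: given a Gaussian mixture fitted to joint vectors z = [x; y], it returns the conditional mean E[y | x]. It is not the authors' implementation. The function name `gmr_predict`, its parameters, and the toy mixture values are hypothetical, and the mapping of x to low-band features and y to the high-band mel spectrum is an assumption made only for this example.

```python
import numpy as np

def gmr_predict(x, weights, means, covs, dx):
    """Conditional mean E[y | x] under a joint GMM over z = [x; y].

    Hypothetical helper: x has length dx; each mean has length dx + dy and
    each covariance is (dx + dy) x (dx + dy), with the x block first.
    """
    x = np.asarray(x, dtype=float)
    log_resp = []    # unnormalised log responsibilities, one per component
    cond_means = []  # per-component conditional means of y given x
    for w, mu, S in zip(weights, means, covs):
        mu = np.asarray(mu, dtype=float)
        S = np.asarray(S, dtype=float)
        mu_x, mu_y = mu[:dx], mu[dx:]
        S_xx, S_yx = S[:dx, :dx], S[dx:, :dx]
        diff = x - mu_x
        sol = np.linalg.solve(S_xx, diff)      # S_xx^{-1} (x - mu_x)
        _, logdet = np.linalg.slogdet(S_xx)
        # Log density of x under this component's marginal Gaussian.
        log_pdf = -0.5 * (diff @ sol + logdet + dx * np.log(2.0 * np.pi))
        log_resp.append(np.log(w) + log_pdf)
        cond_means.append(mu_y + S_yx @ sol)
    log_resp = np.array(log_resp)
    resp = np.exp(log_resp - log_resp.max())   # stabilised normalisation
    resp /= resp.sum()
    return resp @ np.array(cond_means)

# Toy usage (assumed values): one component with unit variances and a
# cross-covariance of 0.5, so the prediction reduces to y = 0.5 * x.
y_hat = gmr_predict([2.0], [1.0], [[0.0, 0.0]], [[[1.0, 0.5], [0.5, 1.0]]], dx=1)
```

In practice the mixture parameters would be fitted on paired training features before this predictor is applied frame by frame; that training step is outside the scope of this sketch.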