Improvements of the voice activity detector in AMR-WB

In speech coding, periods of speech inactivity can be exploited to reduce the average bit rate of the encoded signal. This requires a process commonly referred to as Voice Activity Detection (VAD), which separates speech frames from frames that contain only background noise. The VAD tells the speech encoder to stop or reduce the data flow when no speech is present. The goal is to lower the average bit rate without affecting the perceived speech quality.
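To make the idea of frame-wise classification concrete, the following is a minimal sketch of an energy-threshold VAD. It is not the AMR-WB detector, which is considerably more elaborate. The function names, the 320-sample frame length (20 ms at 16 kHz), and the -40 dB threshold are assumptions chosen for illustration only.

```python
import math


def frame_energy_db(frame):
    """Mean power of one frame in dB, with a small floor to avoid log(0)."""
    power = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(power + 1e-12)


def simple_vad(samples, frame_len=320, threshold_db=-40.0):
    """Return one flag per full frame: True = speech, False = noise only.

    Illustrative assumption: a fixed energy threshold. A practical
    detector would adapt the threshold to the estimated background noise.
    """
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        flags.append(frame_energy_db(frame) > threshold_db)
    return flags


# One quiet frame followed by one loud frame (samples scaled to [-1, 1]).
quiet = [0.0001] * 320
loud = [0.5] * 320
flags = simple_vad(quiet + loud)
```

A fixed threshold like this is exactly the kind of decision rule that struggles with non-stationary noise such as babble, since the noise energy itself fluctuates from frame to frame.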

This work is an investigation and evaluation of possible improvements to the voice activity detector in the Adaptive Multi-Rate Wideband (AMR-WB) speech coder. The purpose of the work was to reduce the sensitivity to babble background noise and to improve the detection of music. The thesis gives a brief introduction to the theory of speech coding and VAD, followed by an outline of the AMR-WB speech coder. The main part of the thesis discusses possible improvements to the detector, starting with recent findings in the Adaptive Multi-Rate Narrowband (AMR-NB) algorithm.

Based on the limited material used for evaluation in this work, the modifications proposed for the AMR-NB VAD also showed good results for AMR-WB. It turned out, however, that additional modifications are needed to ensure reliable detection of high-level non-stationary noise. A music hangover solution is also suggested for better handling of music when the suggested modifications are implemented. The solution suggested for reducing the sensitivity to babble noise offers a compromise between voice activity and speech clipping that can be tuned to the desired performance.
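The trade-off between voice activity and speech clipping can be illustrated with a generic hangover scheme, in which the detector keeps signalling activity for a few frames after the raw decision drops. This is a sketch of the general technique only, not the music hangover proposed in the thesis. The function name and the `hangover_frames` parameter are hypothetical.

```python
def apply_hangover(raw_flags, hangover_frames=5):
    """Extend each active region by up to `hangover_frames` extra frames.

    A longer hangover reduces clipping of weak speech endings but raises
    the overall activity (and thus the average bit rate).
    """
    smoothed = []
    remaining = 0  # hangover frames still available after the last active frame
    for active in raw_flags:
        if active:
            remaining = hangover_frames
            smoothed.append(True)
        elif remaining > 0:
            remaining -= 1
            smoothed.append(True)
        else:
            smoothed.append(False)
    return smoothed


raw = [True, False, False, False, True, False]
smoothed = apply_hangover(raw, hangover_frames=2)
activity = sum(smoothed) / len(smoothed)  # fraction of frames marked active
```

Tuning `hangover_frames` moves the operating point along this compromise: zero gives the lowest activity and the highest risk of clipping, while larger values do the opposite.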

The results and conclusions in this thesis are based on objective tests of limited material and include no formal subjective testing. The conclusions should therefore be treated as guidance for further studies, but they indicate that the proposed solutions will reduce the AMR-WB VAD's sensitivity to non-stationary background noise.

Author: Ekeroth, Andreas

Source: Lulea University of Technology
