Speech Recognition using Hidden Markov Model

The purpose of this final master's degree project was to develop a speech recognition tool and thereby make the technology accessible. The development includes an extensive study of the hidden Markov model, which is currently the state of the art in the field of speech recognition. A speech recognizer is a complex machine developed to understand human speech. In real life, speech recognition technology might be used to improve traffic safety or to assist people with functional disabilities. The technology can also be applied in many other areas. In a real environment, however, there are disturbances that might influence the performance of the speech recognizer. The report includes a performance evaluation in different noise situations in a car environment. The results show that the recognition rate varies from 100% in a noise-free environment to 75% in a noisier environment.

Authors: Mikael Nilsson, Marcus Ejnarsson

Source: Blekinge Institute of Technology

Contents

1 Introduction
2 The Speech Signal
2.1 Speech Production
2.2 Speech Representation
2.2.1 Three-state Representation
2.2.2 Spectral Representation
2.2.3 Parameterization of the Spectral Activity
2.3 Phonemics and Phonetics
2.4 Summary
3 From Speech To Feature Vectors
3.1 Preprocessing
3.1.1 Preemphasis
3.1.2 Voice Activation Detection (VAD)
3.2 Frame Blocking and Windowing
3.3 Feature Extraction
3.3.1 Linear Prediction
3.3.2 Mel-Cepstrum
3.3.3 Energy measures
3.3.4 Delta and Acceleration Coefficients
3.3.5 Summary
3.4 Postprocessing
4 Hidden Markov Model
4.1 Discrete-Time Markov Model
4.1.1 Markov Model of Weather
4.2 Discrete-Time Hidden Markov Model
4.2.1 The Urn and Ball Model
4.2.2 Discrete Observation Densities
4.2.3 Continuous Observation Densities
4.2.4 Types of Hidden Markov Models
4.2.5 Summary of elements for a Hidden Markov Model
4.3 Three Basic Problems for Hidden Markov Models
4.4 Solution to Problem 1 – Probability Evaluation
4.4.1 The Forward Algorithm
4.4.2 The Backward Algorithm
4.4.3 Scaling the Forward and Backward Variables
4.5 Solution to Problem 2 – “Optimal” State Sequence
4.5.1 The Viterbi Algorithm
4.5.2 The Alternative Viterbi Algorithm
4.6 Solution to Problem 3 – Parameter Estimation
4.6.1 Multiple Observation Sequences
4.6.2 Initial Estimates of HMM Parameters
4.6.3 Numerical Issues
5 Speech Quality Assessment
5.1 The Classical SNR
5.2 The Segmental SNR
5.3 Comparison between SNR measures
5.4 The Itakura Measure
6 Practical Experimental Results
6.1 Measurements in car
6.2 Performance in noisy environment
7 Summary and Conclusions
7.1 Further Work
A Phonemes
A.1 Continuant
A.1.1 Vowels
A.1.2 Consonants
A.2 Non-Continuant
A.2.1 Diphthongs
A.2.2 Semivowels
A.2.3 Stops