Research On Voice Activity Detection Based On ACAM And Traditional Classification Model

Posted on: 2020-01-20
Degree: Master
Type: Thesis
Country: China
Candidate: M Zhang
Full Text: PDF
GTID: 2428330575459195
Subject: Engineering
Abstract/Summary:
In recent years, with the rapid development of intelligent control technology, the demand for hands-free, intelligent voice control has become increasingly urgent, and research on voice technology has attracted growing attention. Speech technology depends on speech signal processing, and voice activity detection is the foundation of speech signal processing. Voice activity detection locates the start and end points of speech in an audio stream, thereby separating voice segments from non-voice segments. Subsequent speech processing can then be applied directly to the voice segments, which avoids meaningless operations on non-voice signals and improves operational efficiency. Research on voice activity detection is therefore of great significance to the practical application of speech signals.

Audio data is affected by background noise and by noise from the acquisition equipment during recording, which poses a major challenge for voice activity detection. In addition, because labeling audio data is expensive, training data is scarce, which makes accurate voice activity detection even more difficult.

The research topic of this thesis comes from the practical requirements of a technology company in Shenzhen, where the author worked as an intern. The company's products must perform voice activity detection on audio streams collected in complex real-world environments. They require high detection accuracy, but at the same time face the problem of insufficient training data. This thesis therefore develops voice activity detection based on ACAM and traditional classification models. The specific research work is as follows:

(1) In the first set of voice activity detection experiments, ACAM is the classification model, and STE+SZCR, MFCC and MRCG are used as audio features. By comparing and analyzing the detection performance of ACAM under the different audio features, MRCG is selected as the audio feature for the subsequent experiments.

(2) In the second set of voice activity detection experiments, ACAM, SVM and HMM are each used as the classification model, with MRCG as the audio feature. Based on an analysis of the detection results of each model, a voice activity detection scheme is proposed that fuses the detection results of ACAM with those of SVM or HMM.

(3) The detection results of ACAM and SVM, and of ACAM and HMM, are each fused using a logical OR operation. When training data is scarce, this fusion scheme effectively improves the recall of the ACAM model and detects more of the real voice data.

(4) The detection results of ACAM and SVM, and of ACAM and HMM, are each fused using linear logistic regression. When training data is scarce, this fusion scheme effectively improves the recall of the ACAM model, detects more of the real voice data, and reduces the loss of voice information. Moreover, compared with the logical OR fusion algorithm, the linear logistic regression fusion algorithm does not misclassify background noise as speech while retrieving more speech data, so its overall detection performance is better.
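The two fusion schemes in items (3) and (4) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the abstract does not say how the regression is trained or what inputs it takes, so the sketch assumes each model emits one per-frame speech score in [0, 1], reads "linear logic regression" as ordinary logistic regression over the two scores, and fits it by plain batch gradient descent. All function names, the learning rate, and the toy data are hypothetical.

```python
import math


def or_fusion(primary, secondary):
    """Frame-wise logical OR of two binary decision sequences (1 = speech)."""
    if len(primary) != len(secondary):
        raise ValueError("decision sequences must have the same length")
    return [int(bool(p) or bool(s)) for p, s in zip(primary, secondary)]


def recall(predicted, truth):
    """Fraction of true speech frames that were labeled as speech."""
    true_pos = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 1)
    actual_pos = sum(truth)
    return true_pos / actual_pos if actual_pos else 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_fusion(scores_a, scores_b, targets, lr=1.0, epochs=5000):
    """Fit (w_a, w_b, bias) by batch gradient descent on the mean log loss.

    Assumption: scores_a and scores_b are per-frame speech scores from the
    two models being fused; targets are 0/1 frame labels.
    """
    w_a = w_b = bias = 0.0
    n = len(targets)
    for _ in range(epochs):
        g_a = g_b = g_0 = 0.0
        for s_a, s_b, y in zip(scores_a, scores_b, targets):
            err = sigmoid(w_a * s_a + w_b * s_b + bias) - y
            g_a += err * s_a
            g_b += err * s_b
            g_0 += err
        w_a -= lr * g_a / n
        w_b -= lr * g_b / n
        bias -= lr * g_0 / n
    return w_a, w_b, bias


def logistic_fusion(scores_a, scores_b, weights, threshold=0.5):
    """Label a frame as speech when the fused probability reaches threshold."""
    w_a, w_b, bias = weights
    return [
        int(sigmoid(w_a * s_a + w_b * s_b + bias) >= threshold)
        for s_a, s_b in zip(scores_a, scores_b)
    ]


if __name__ == "__main__":
    # Hypothetical frame labels and decisions, for illustration only.
    truth = [1, 1, 0, 1]
    acam = [1, 0, 0, 0]
    other = [0, 1, 1, 0]
    fused = or_fusion(acam, other)
    print("ACAM recall:", recall(acam, truth))
    print("OR-fused recall:", recall(fused, truth))
```

OR fusion can only add speech frames, so its recall is never lower than that of either input, but every false alarm from either model is carried over. A learned weighting of the two scores can reject frames that only one model scores weakly as speech, which is consistent with the comparison drawn in item (4).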
Keywords/Search Tags: Voice activity detection, ACAM, SVM, HMM, MRCG, Logic operation, Linear logistic regression