
Research On Non-concurrent Speaker Separation Technology For Corpus Acquisition System

Posted on: 2019-12-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y He
GTID: 2438330551460808
Subject: Electronic and communication engineering
Abstract/Summary:
The past twenty years have witnessed the rapid development of artificial intelligence technology. In this field, data resources have become a driving force for research institutes, and much importance has been attached to obtaining audio data from the Internet. Most public corpus collection systems are built on a distributed crawler structure. However, no effective speaker diarization subsystem is available for the non-concurrent speaker voice resources crawled from web pages. This paper studies a gender-based speaker diarization method for non-concurrent speaker audio resources. The method is a subsystem of a Hadoop-based corpus collection system: it processes the audio data delivered by the crawler network and outputs two types of corpus with gender labels. The method consists of two key steps: speaker segmentation based on the Bayesian Information Criterion and a Universal Background Model (BIC-UBM), and speaker gender recognition based on a deep neural network (DNN). The BIC-UBM segmentation is a two-step decision method for speaker turning points: candidate turning points are detected first, and each candidate is then judged true or false. It divides the voice signal according to the speaker's gender. The DNN-based gender recognition network then identifies the gender of each segmented utterance and outputs two types of voice signals accordingly. Experiments on the voice library provided by the internship company show that the improved speech segmentation method achieves 94% accuracy in detecting speaker turning points, with a missed detection rate of 6% and a false alarm rate of 14%. The DNN-based gender recognition achieves 96% accuracy on the segmented speech, with a recall of 94% for male samples and 98% for female samples. The speaker diarization method studied in this paper meets the requirements of the corpus acquisition system well and lays a solid corpus foundation for the project carried out during the internship.
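The first decision step described in the abstract, detecting a candidate speaker turning point with the Bayesian Information Criterion, can be illustrated with a minimal sketch. The abstract does not give the thesis's features, window sizes, or thresholds, so everything below is an assumption: the frames are treated as generic feature vectors, each segment is modeled by a single Gaussian with a diagonal covariance (a simplification of the usual full-covariance form), and the names `delta_bic`, `best_change_point`, `margin`, and `lam` are hypothetical. The second step, confirming or rejecting the candidate with a UBM, is not shown.

```python
import math
import random


def log_det_diag(frames):
    """Log-determinant of a diagonal covariance fitted to the frames."""
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for d in range(dim):
        column = [frame[d] for frame in frames]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n
        total += math.log(max(var, 1e-10))  # floor avoids log(0)
    return total


def delta_bic(frames, i, lam=1.0):
    """Delta-BIC for splitting frames at index i; a positive value favors a split.

    Assumption: one diagonal-covariance Gaussian per segment, so the split
    model has 2 * dim extra parameters (a mean and a variance per dimension).
    """
    n = len(frames)
    dim = len(frames[0])
    left, right = frames[:i], frames[i:]
    gain = 0.5 * (
        n * log_det_diag(frames)
        - len(left) * log_det_diag(left)
        - len(right) * log_det_diag(right)
    )
    penalty = 0.5 * lam * (2 * dim) * math.log(n)
    return gain - penalty


def best_change_point(frames, margin=10, lam=1.0):
    """Return (index, score) of the best candidate, or (None, score) if no split is favored."""
    if len(frames) <= 2 * margin:
        raise ValueError("window too short for the chosen margin")
    scores = [(delta_bic(frames, i, lam), i)
              for i in range(margin, len(frames) - margin)]
    score, index = max(scores)
    return (index, score) if score > 0 else (None, score)


if __name__ == "__main__":
    # Synthetic two-speaker window: the feature statistics change at frame 100.
    rng = random.Random(0)
    window = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
    window += [[rng.gauss(5, 1), rng.gauss(5, 1)] for _ in range(100)]
    print(best_change_point(window))
```

In a two-step scheme of the kind the abstract describes, a candidate returned by `best_change_point` would only be a proposal; a separate verification stage would then accept or reject it, which is how a detector can trade a higher false alarm rate at this stage for fewer missed turning points.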
Keywords/Search Tags:Speaker diarization, Crawler network, Bayesian Information Criterion, Universal Background Model, Deep Neural Networks