
Research Of Indoor Speaker Recognition Based On VQ And Implemented On FPGA

Posted on: 2020-11-03
Degree: Master
Type: Thesis
Country: China
Candidate: S Chen
Full Text: PDF
GTID: 2392330572494859
Subject: Electrical engineering
Abstract/Summary:
Voice control is a natural and efficient control mode that has received growing attention as smart homes have developed. In smart home applications, speaker recognition is a key technical problem that directly affects the safety of people and property. At present, most speaker recognition systems are built on computer platforms, which respond slowly and are therefore limited in practical application. This thesis studies an algorithm for indoor speaker recognition and its hardware implementation. Exploiting the high speed and low power consumption of the FPGA, an indoor speaker recognition system oriented to voice control is designed and implemented.

Indoor speaker recognition consists of voice activity detection and vector quantization (VQ) recognition. Voice activity detection is the basis of the speaker recognition system, and an improved energy statistical complexity algorithm is proposed for it. After the FFT of the speech signal, the information entropy of one frame can be calculated from only the spectral energy, and its logarithm, of the first half of the frame's data; the probability density calculation is omitted. The statistical complexity value is then calculated from the information entropy, and finally the energy statistical complexity is obtained by combining it with the speech energy. The improved algorithm allows the speech signal to be processed in a pipeline. It reduces the amount of calculation, improves data-processing efficiency, and is better suited to a hardware platform.

The hardware implementation of VQ recognition consists of feature parameter extraction, Euclidean distance calculation, and minimum distortion calculation. The feature parameters are 24-dimensional Mel-frequency cepstral coefficients (MFCCs); the Mel filtering and the discrete cosine transform are both implemented with look-up tables. The squared Euclidean distance between the feature parameters of each frame and each code vector in the codebook is calculated. Through timing control, the original 24 squaring modules are reduced to 6, which saves hardware resources. The minimum squared Euclidean distance is selected for each frame and added to the accumulated minimum error of the previous frames. At the end of the utterance, the accumulated minimum error is divided by the number of effective frames to obtain the minimum distortion of the speech. Speaker recognition is finally achieved by comparing this minimum distortion with a preset threshold.

The system is built around the EP4CE55F23C8 chip from ALTERA, and speaker recognition is realized on the hardware platform using pipelining. The experimental results show that the system's voice endpoint detection performs well in both high and low signal-to-noise ratio environments, and that the measured number of effective voice frames is accurate. The response time of the system is 96 ms. In the laboratory environment, text-dependent speech tests on designated speakers reach a recognition accuracy of up to 94%. Compared with setting only an upper threshold for speaker recognition, setting both upper and lower thresholds effectively reduces the rate at which other utterances by the designated speaker are misrecognized. The system is efficient, responds quickly, and is widely applicable, and it has good application prospects in the smart home field.

Figure [56] Table [9] Reference [63]...
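Two steps of the abstract can be illustrated with a short software sketch. It is not the thesis's FPGA design, and every function and parameter name in it is hypothetical. The first part shows one plausible reading of how entropy can be obtained "only by the spectral energy and its logarithm": with S = sum(E_i) and T = sum(E_i * ln E_i), the Shannon entropy of the normalized spectrum equals ln(S) - T / S, so no per-bin probability is ever formed and both sums can be accumulated as bins stream in. The second part averages the per-frame minimum squared Euclidean distance and applies a two-sided threshold; accepting a speaker when the distortion lies between the lower and upper thresholds is an assumption, since the abstract does not state the exact decision rule. The statistical complexity step is omitted because the abstract does not give its formula.

```python
import math


def spectral_entropy_direct(energies):
    """Reference version: entropy from explicit probabilities p_i = E_i / sum(E)."""
    total = sum(energies)
    if total <= 0:
        return 0.0
    return -sum((e / total) * math.log(e / total) for e in energies if e > 0)


def spectral_entropy_streaming(energies):
    """Same entropy from two running sums, with no per-bin probability.

    H = ln(S) - T / S, where S = sum(E_i) and T = sum(E_i * ln(E_i)).
    """
    s = 0.0  # running sum of spectral energy
    t = 0.0  # running sum of energy times its logarithm
    for e in energies:
        if e > 0:  # zero-energy bins contribute nothing to either sum
            s += e
            t += e * math.log(e)
    if s <= 0:
        return 0.0
    return math.log(s) - t / s


def mean_min_distortion(frames, codebook):
    """Average, over frames, of the smallest squared distance to any code vector."""
    accumulated = 0.0
    for frame in frames:
        accumulated += min(
            sum((x - c) ** 2 for x, c in zip(frame, code)) for code in codebook
        )
    return accumulated / len(frames)


def accept_speaker(distortion, lower, upper):
    """Assumed decision rule: accept only when lower <= distortion <= upper."""
    return lower <= distortion <= upper


if __name__ == "__main__":
    energies = [1.0, 2.0, 3.0, 4.0]  # made-up spectral energies for one frame
    print(spectral_entropy_direct(energies), spectral_entropy_streaming(energies))

    frames = [[0.0, 0.0], [1.0, 1.0]]    # toy 2-dimensional feature vectors
    codebook = [[0.0, 0.0], [2.0, 2.0]]  # toy codebook with two code vectors
    d = mean_min_distortion(frames, codebook)
    print(d, accept_speaker(d, lower=0.5, upper=1.5))
```

The sketch uses floating-point arithmetic and toy 2-dimensional vectors for brevity; the thesis describes 24-dimensional MFCC features and look-up-table arithmetic, and the time-multiplexing of the squaring modules is not modelled here.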
Keywords/Search Tags:speaker recognition, voice activity detection, energy statistics complexity algorithm, vector quantization, FPGA