Research On In-car Speech Recognition Based On One-dimensional Convolutional Neural Networks

Posted on:2018-04-28

Degree:Master

Type:Thesis

Country:China

Candidate:X X Zhu

GTID:2348330515492881

Subject:Computer application technology

Abstract/Summary:

With economic development, cars have become increasingly common. As the number of cars has grown, drivers have come to expect greater safety and convenience. To meet these expectations, many on-board devices use speech as the main way for drivers to interact with the car, because speech is one of the most efficient forms of human communication. In this process, the key technology affecting the user experience is speech recognition.

Speech recognition technology has developed over several decades, and combining it with neural networks has greatly improved its performance. Convolutional neural networks (CNNs) have been widely used in speech recognition because of their excellent abilities of local observation and high-level aggregation. However, the architecture of traditional CNNs is two-dimensional (2D), which cannot reflect the one-dimensional character of the speech signal. Therefore, a one-dimensional (1D) architecture for speech recognition in car environments is proposed; by convolving along the time axis, it better captures temporal variation while retaining band correlation. In addition, this thesis studies the pre-processing algorithms of the speech recognition system. The main work is as follows:

(1) Based on an analysis of vehicle noise characteristics and the convolutive-mixture acoustic environment, this thesis studies two speech enhancement algorithms suited to the in-car environment: spectral subtraction based on multi-taper spectrum estimation, and independent component analysis (ICA). Their effectiveness is verified by simulation experiments. To address the poor performance of commonly used voice activity detection (VAD) algorithms in car environments, this thesis proposes a VAD algorithm based on the weighted power spectrum. First, the distribution coefficients of the noise spectrum energy are estimated; then the spectrum energy weighting coefficient of each subband is calculated by the weighting function. Adjusting the spectrum energy of different subbands increases the discrimination between noise and speech in the power spectrum. Experimental results show that the proposed VAD algorithm performs better in car environments, with a detection accuracy about 23% higher than that of other algorithms under different SNR conditions.

(2) Theoretical analysis and experiments show that MFCC is more robust and more resistant to interference than LPCC in car environments. The commonly used speech recognition algorithms, including DTW, HMM and BP neural networks, are also studied.

(3) Based on the one-dimensional character of the speech signal, a one-dimensional architecture for speech recognition in car environments is proposed. In contrast to the two-dimensional architecture, the convolution kernel in a one-dimensional convolutional neural network is a vector, equivalent to an observation window sliding along the time axis of the speech signal. The one-dimensional kernel extracts local features of the signal while preserving its temporal variation and band correlation. Experimental results show that one-dimensional convolutional neural networks achieve better recognition performance than two-dimensional convolutional neural networks and other common speech recognition algorithms, in both quiet and in-car environments.

(4) The influence of the structural parameters of the one-dimensional convolutional neural network on the recognition rate is analyzed experimentally. Because different convolution kernel lengths affect recognition performance differently under different SNR conditions, an adaptive selection of the network structure based on front-end noise estimation is proposed. Finally, a speech recognition system based on one-dimensional convolutional neural networks is built on the Matlab platform, and the effectiveness of the algorithm is verified.
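
The idea of convolving along the time axis can be illustrated with a minimal sketch. This is not the thesis's Matlab implementation: it assumes the input is a feature matrix of bands by frames (for example MFCC coefficients per frame) and that each kernel spans all bands while sliding over time, which is one common reading of "1D convolution" for speech. The function name `conv1d_time`, the ReLU activation and the toy values are all hypothetical.

```python
def conv1d_time(features, kernel, bias=0.0):
    """Valid 1D convolution (cross-correlation form) along the time axis.

    features: list of n_bands rows, each holding n_frames values
    kernel:   list of n_bands rows, each holding k weights (k <= n_frames)
    returns:  list of n_frames - k + 1 ReLU activations, one per window position

    Assumption: the kernel covers every band at once, so band correlation
    is kept inside each window while the window moves only in time.
    """
    n_bands = len(features)
    n_frames = len(features[0])
    k = len(kernel[0])
    out = []
    for t in range(n_frames - k + 1):      # slide the window over time only
        acc = bias
        for b in range(n_bands):           # sum over all bands in the window
            for j in range(k):
                acc += kernel[b][j] * features[b][t + j]
        out.append(max(0.0, acc))          # ReLU (an illustrative choice)
    return out


# Toy input: 2 bands x 5 frames (made-up values, not thesis data)
features = [[1.0, 2.0, 3.0, 4.0, 5.0],
            [0.0, 1.0, 0.0, 1.0, 0.0]]
# One kernel of length 3: a time difference on band 0, a centre tap on band 1
kernel = [[-1.0, 0.0, 1.0],
          [0.0, 1.0, 0.0]]

activations = conv1d_time(features, kernel)
print(activations)  # [3.0, 2.0, 3.0]
```

A longer kernel (larger `k`) observes more frames per window and yields fewer output positions, which is the structural parameter that item (4) varies with the SNR.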

Keywords/Search Tags:

convolutional neural networks, speech recognition, in-car environment, voice activity detection, weighted power spectrum

Related items

1	Convolutional Neural Networks For Voice Activity Detection
2	Research And Implementation Of Voice Activity Detection Algorithm Based On Convolutional Neural Networks
3	Research On Noisy Voice Activity Detection Method
4	High Robust Low Power Voice Activity Detection Design Based On DNN
5	Research On Multimodal Voice Conversion Under Adverse Environment Using Deep Convolutional Neural Network
6	Research On Chinese Continuous Speech Recognition In Noisy Environment
7	Research On Speech Recognition Based On Convolutional Neural Networks
8	Research On Low-SNR Speech Enhancement In Driving Environment
9	Research On Signal Preprocessing For Interactive Speech Recognition
10	Study On The Voice Activity Detection Method In Low SNR Environment