
Speech Recognition And Application Research Of Intelligent Mining Robot

Posted on: 2022-02-08
Degree: Master
Type: Thesis
Country: China
Candidate: B Wang
Full Text: PDF
GTID: 2568306839483224
Subject: Control engineering
Abstract/Summary:
With the development of modern technology, people use speech not only to communicate with each other but also to pass information to machines, which makes daily life and production more convenient. Enabling machines to understand human speech is an essential part of this, and speech recognition arose to meet that demand. In practice, speech recognition often takes place in a noisy environment, as with the intelligent mining robot addressed in this thesis. Because noise distorts the features of the input speech, recognition accuracy drops sharply. Considering engineering requirements and engineering safety, the scheme developed in this thesis recognizes instruction words, a task known in the academic literature as isolated word recognition.

Firstly, for speech recognition in a noisy environment, this thesis introduces the classic HMM-GMM model and the GRU-based LAS model, both of which perform recognition by sequence inference. Noise in the speech signal causes inference errors in such models, which can make the entire recognition task fail. This thesis therefore proposes an instruction word recognition model based on image recognition, namely a convolutional neural network (CNN). The speech signal is converted into an image, and the CNN's strong tolerance of noise in images is used to recognize the instruction words of the intelligent mining robot in a noisy environment.

Secondly, analysis of the actual project shows that the intelligent mining robot must perform online instruction word recognition in a noisy environment. Online recognition in noise requires speech endpoint detection, that is, distinguishing the speech signal from the noise signal. For online endpoint detection in noisy environments, this thesis proposes a CNN-based binary classification network that determines whether the input to the speech recognition system is a speech signal. This method implements online speech endpoint detection in a noisy environment and thereby supports reliable online collection of speech signals.

Thirdly, actual projects may receive speech that is not an instruction word, and the recognition system must reject such input. This is the open set recognition problem. To reject non-instruction-word speech, this thesis first applies the classic threshold method and the unknown class method on top of the CNN recognition model. Both perform poorly: the threshold method suffers from the non-linear mapping of the deep learning model and the error introduced between the feature vector and the prediction output in the CNN, while the unknown class method suffers from the unpredictability of the unknown class. To solve the open set recognition problem, this thesis adopts a feature vector method, which uses the feature vectors inside the CNN to fit a probability distribution for each class and rejects non-instruction-word speech on that basis. Compared with the threshold method, the rejection rate is greatly improved. Compared with the unknown class method, the advantage is that no unknown class samples are needed to train the model.

Finally, for the online instruction word recognition experiments of the intelligent mining robot in a noisy environment, the hardware composition and connections are introduced and the software implementation is described. The experiments cover instruction word recognition in a noise-free environment and in a noisy environment. Both sets of experiments use the HMM-GMM, LAS, and CNN models, and each method is evaluated by comparing and analyzing the experimental results. Experiments and results are likewise given for speech endpoint detection with the CNN-based binary classification network. For open set recognition, experiments were carried out with the threshold method, the unknown class method, and the feature vector method, and the methods are evaluated by comparing and analyzing the results.
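
The feature vector method for open set rejection can be illustrated with a minimal sketch. The abstract does not state which probability distribution is fitted or how the rejection rule is defined, so everything below is an assumption for illustration only: an independent (diagonal) Gaussian per class, a root-mean-square z-score as the distance, a cutoff of 3.0, the command labels `forward` and `stop`, and toy 2-D vectors standing in for feature vectors taken from the CNN.

```python
import math
from statistics import mean, pstdev


def fit_class_gaussians(features_by_class, min_std=1e-3):
    """Fit an independent (diagonal) Gaussian to each class's feature vectors.

    Assumption: the thesis only says a distribution is fitted per class;
    the diagonal Gaussian here is an illustrative choice.
    """
    models = {}
    for label, vectors in features_by_class.items():
        dims = list(zip(*vectors))  # transpose: one tuple per feature dimension
        mu = [mean(d) for d in dims]
        sigma = [max(pstdev(d), min_std) for d in dims]  # floor avoids divide-by-zero
        models[label] = (mu, sigma)
    return models


def normalized_distance(x, mu, sigma):
    """Root-mean-square z-score of x under a diagonal Gaussian."""
    total = sum(((xi - m) / s) ** 2 for xi, m, s in zip(x, mu, sigma))
    return math.sqrt(total / len(x))


def classify_or_reject(x, models, max_distance=3.0):
    """Return the closest class label, or None if x is far from every class."""
    best_label, best_dist = None, float("inf")
    for label, (mu, sigma) in models.items():
        d = normalized_distance(x, mu, sigma)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label if best_dist <= max_distance else None


# Hypothetical toy feature vectors for two instruction words.
train = {
    "forward": [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9), (1.0, 1.0)],
    "stop": [(5.0, 5.1), (5.1, 4.9), (4.9, 5.0), (5.0, 5.0)],
}
models = fit_class_gaussians(train)

print(classify_or_reject((1.05, 1.0), models))  # near the "forward" cluster
print(classify_or_reject((3.0, 3.0), models))   # far from both clusters, rejected
```

As in the abstract's comparison with the unknown class method, this kind of rule needs no samples of non-instruction words: only the known classes are modeled, and anything that fits none of them is rejected.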
Keywords/Search Tags: Speech Recognition, Intelligent Mining Robot, CNN, Speech Endpoint Detection, Open Set Recognition