
Research And Application Of Speech Enhancement And Recognition For Civil Aviation Air And Land Calls

Posted on: 2024-05-19
Degree: Master
Type: Thesis
Country: China
Candidate: H B Kang
Full Text: PDF
GTID: 2542307079472624
Subject: Electronic information
Abstract/Summary:
Speech technologies such as speech enhancement and speech recognition have a significant impact on fields that rely on spoken communication, such as civil aviation, aerospace, and remote conferencing. In civil aviation land-air communication, for example, cabin noise, the background noise of radio communication, and the roar of the aircraft in flight combine to form a complex acoustic scene. Applying speech technology can improve the quality of land-air communication and reduce the workload of pilots and controllers, thereby reducing flight safety risks. This thesis uses deep learning to explore how to reduce the impact of complex noise on call quality in civil aviation land-air calls, with the aim of obtaining a robust speech recognition model with a high recognition rate, and puts the algorithms into practice by building an enhancement and recognition system for civil aviation land-air calls.

The study faces three main difficulties. First, civil aviation speech is captured under low signal-to-noise ratio conditions with many kinds of noise, which affects the speech enhancement model. Second, speech enhancement is usually used as the upstream module of speech recognition, with the enhanced speech features serving as the input to recognition; however, the speech distortion introduced by enhancement can significantly reduce recognition accuracy. Third, because of the specialized nature of civil aviation, little training data is available, so a model with a high recognition rate cannot be obtained directly.

To address these problems, the main contributions of this thesis are as follows:

1) This thesis proposes a speech enhancement algorithm based on the Cross-Dimensional Collaborative Attention Mechanism (CADNet). A cross-dimensional collaborative attention module is introduced between the encoder and decoder; it fuses the model's features on both spatial and temporal scales to better control the flow of information and suppress irrelevant noise components. A deformable convolution module is also introduced to adaptively adjust its scale or receptive field size, extract feature information of different sizes or deformations at different positions, and better match the natural features of the voiceprint, thereby strengthening the model's ability to analyze information.

2) This thesis proposes a joint training model for speech enhancement and recognition, which comprises the above speech enhancement network, a gated recursive mechanism, and an end-to-end speech recognition network. A single loss function jointly optimizes these three modules to mitigate the effect of speech distortion on speech recognition. To address the shortage of training data, the thesis uses transfer learning: the model is pre-trained on open data sets, then transferred to and fine-tuned in the civil aviation domain. To address limited computing resources during joint training, the thesis introduces a dilated self-attention mechanism to reduce model complexity, and finally completes a speech enhancement and recognition model for civil aviation land-air calls.

3) Finally, for civil aviation land-air scenarios, this thesis designs a speech enhancement and recognition system that integrates the above algorithms and provides file management, display of speech enhancement results, and display of speech recognition results, so that the effect of the algorithms can be shown more intuitively.
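The abstract does not specify how the dilated self-attention mechanism is constructed, so the following is only a minimal sketch of one common way such a mechanism can cut attention cost. It assumes each query keeps every key inside a small local window and, beyond that window, only every `dilation`-th key. The function names, the `window` and `dilation` parameters, and the masking pattern itself are hypothetical illustrations and are not taken from the thesis.

```python
def dilated_attention_mask(seq_len, window, dilation):
    """Build a boolean mask; mask[i][j] is True if query i may attend to key j.

    Assumed pattern (illustrative only, not the thesis's definition):
      - keys within `window` frames of the query are always kept (local context);
      - elsewhere, only keys whose index is a multiple of `dilation` are kept
        (coarse long-range context).
    """
    if seq_len < 0 or window < 0 or dilation < 1:
        raise ValueError("seq_len and window must be >= 0 and dilation >= 1")
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            near = abs(i - j) <= window      # dense local neighbourhood
            far_sampled = j % dilation == 0  # subsampled distant frames
            row.append(near or far_sampled)
        mask.append(row)
    return mask


def attended_pairs(mask):
    """Count the query-key pairs that the mask allows."""
    return sum(sum(row) for row in mask)


if __name__ == "__main__":
    seq_len = 200
    # dilation=1 keeps every key, which is ordinary full self-attention.
    full = attended_pairs(dilated_attention_mask(seq_len, window=0, dilation=1))
    sparse = attended_pairs(dilated_attention_mask(seq_len, window=4, dilation=8))
    print("full attention pairs:", full)
    print("dilated attention pairs:", sparse)
```

Under these assumptions, the number of scored query-key pairs grows roughly as `seq_len * (2 * window + 1 + seq_len / dilation)` instead of `seq_len ** 2`, which is the kind of saving that makes joint training of several modules more practical on limited hardware.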
Keywords/Search Tags: Speech Recognition, Speech Enhancement, Joint Training, Cross-Dimensional Collaborative Attention Mechanism, Dilated Self-Attention