| In recent years, driven by the concept of intelligent navigation and the vigorous development of the intelligent shipbuilding industry, natural language processing technologies such as speech recognition and speech synthesis have gradually made speech an important channel of human-computer interaction. In unmanned ship-shore interaction and ship intelligence in particular, speech shows broader applicability and brings greater application value. This thesis studies the construction of a closed-loop speech chain recognition system based on deep learning and its application to ship maneuvering. The main research work of this thesis is as follows: (1) Given the lack of public speech datasets, the scattered distribution of usable information points, and the small amount of speech material collected, the composition of the nautical corpus was studied and analyzed. By collecting original and self-recorded speech on site in the bridge and surrounding cabins of the Dalian Maritime University teaching ship "Yu Kun", we optimized the data processing and structural design of the corpus and expanded the nautical speech training corpus. (2) To address the heavy data pre-processing and the complex, error-prone training process of traditional speech recognition models, we design an online recognition architecture for nautical speech streams. This thesis adopts end-to-end speech recognition in place of traditional speech models, converting speech directly to text, and establishes a speech recognition framework based on RNN-T (Recurrent Neural Network Transducer) that accepts variable-length input. Experiments show that the model captures more semantic information, is faster and smaller than other transcription models, and meets the requirement of ship maneuvering for immediate acquisition of real-time speech commands. (3) Construction of a closed-loop speech chain model based on ASR-TTS. Given the special characteristics of the maritime field, this thesis builds a speech recognition system with a "closed-loop with feedback" mechanism and introduces the concept of Dual Learning to establish a dual learning model between speech recognition (ASR) and speech synthesis (TTS). The model improves on the paradigm of a standalone speech recognition system and can significantly improve the conversion efficiency and transcription accuracy of speech recognition. (4) Robustness study of the speech recognition system. While under way, the ship is mainly affected by waves, noise from the hull, and continuous electromagnetic noise from machinery in the bridge. Under such interference, speech recognition is not robust enough to meet the demand for stability. This thesis takes a signal-space perspective to analyze methods for improving recognition accuracy in noisy environments, and uses a speech enhancement algorithm to improve the system's noise immunity and robustness. (5) Research on the functional applications of the system. Building on the complete online speech-stream recognition system, application modules are designed and implemented for human-computer interaction in ship maneuvering, VHF recognition, and mental health detection of ocean-going seafarers, ensuring that the system delivers the planned recognition functions in the ship maneuvering environment. This thesis constructs a 388-hour professional corpus (Z-Nautical Corpus, Z-NC) in the field of navigation and realizes a more complete human-computer interaction system for ship maneuvering speech recognition and its applications. The method offers a new way to advance ship intelligence, explore new modes of maritime voice maneuvering, and effectively improve the information interaction between the pilot and the ship. |