| Text input is a fundamental part of human-computer interaction and is almost indispensable for using computers. Efficient input of Chinese characters has always been the most basic requirement of people who use computers in Chinese, and this demand is increasingly pressing. Since its emergence, deep learning has made great progress in machine vision, speech recognition, question answering (Q&A), machine translation (MT), and other fields. This paper studies input methods based on deep learning, with the aim of making Chinese text input more accurate and efficient. The research content of this paper can be summarized as follows. (1) Research on and improvement of the traditional input method model. Almost all popular input methods currently use Hidden Markov Models (HMMs) for pinyin-Chinese conversion. This paper uses recurrent neural networks (RNNs) to improve the state transition process of the HMM, which increases conversion accuracy on the pinyin-Chinese conversion task by 5 percentage points. (2) Input methods based on deep learning models. Pinyin-Chinese conversion can be viewed as a sequence labeling task in which the input sequence is a pinyin string and the labels are Chinese characters. This paper studies the bidirectional LSTM model and implements a pinyin-Chinese conversion system from the sequence labeling perspective. Alternatively, if pinyin and Chinese characters are regarded as two independent languages, pinyin-Chinese conversion can be viewed as machine translation. This paper therefore also uses the Seq2Seq model from the field of machine translation to implement pinyin-Chinese conversion. (3) Study of the Seq2Seq model. This paper studies the effect of the attention mechanism on the Seq2Seq model and compares two attention mechanisms, Global Attention and Local Attention, on the pinyin-Chinese conversion task. In addition, we propose a Seq2Seq model that uses CNNs as the encoder and LSTMs as the decoder. Combined with Local Attention and position embedding, this model improves accuracy in the pinyin-Chinese conversion experiment by 3 percentage points over the baseline model without an attention mechanism. (4) Building on the study of pinyin-Chinese conversion, this paper implements an input method kernel system. The system comprises pinyin string segmentation, whole-sentence conversion, candidate word and candidate phrase generation, and interface display. In addition, the input method system implemented in this paper crawls new media data on the Internet 24 hours a day and supports new word mining and automatic updating of the input method model. Through continuous new word discovery and automatic iterative model updates, the system keeps learning new words, new usages, and newly emerging features of the Chinese language on the Internet, so that users can enter up-to-date Chinese sentences through pinyin. Compared with the current WI input method, the latest input model of this paper scores nearly 5 percentage points higher on whole-sentence pinyin input and nearly 4% higher on whole-sentence input with the square-grid input method. |
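
As a point of reference for the HMM-based pinyin-Chinese conversion mentioned in item (1), the following is a minimal sketch of Viterbi decoding over candidate characters. It illustrates only the classical HMM baseline, not the RNN-improved transition model or any system described in this paper. The tables `CANDIDATES`, `START`, and `TRANS`, the `floor` smoothing value, and all probabilities are hypothetical toy values chosen for illustration. Emission probabilities are taken to be 1 because each toy character has exactly one pinyin reading.

```python
import math

# Hypothetical toy model: every value below is made up for illustration.
CANDIDATES = {            # pinyin syllable -> candidate characters
    "ni": ["你", "泥"],
    "hao": ["好", "号"],
}
START = {"你": 0.6, "泥": 0.4}   # P(first character)
TRANS = {                        # P(next character | previous character)
    ("你", "好"): 0.8, ("你", "号"): 0.2,
    ("泥", "好"): 0.3, ("泥", "号"): 0.7,
}


def viterbi(syllables, candidates, start, trans, floor=1e-8):
    """Return (best character string, log-probability) for a pinyin sequence.

    Unseen start or transition events fall back to `floor`, an assumed
    smoothing constant, so that log() is always defined.
    """
    if not syllables:
        return "", 0.0

    # best[c] = (log-prob of the best path ending in character c, that path)
    best = {
        c: (math.log(start.get(c, floor)), [c])
        for c in candidates[syllables[0]]
    }

    for syl in syllables[1:]:
        nxt = {}
        for c in candidates[syl]:
            # Pick the predecessor that maximizes the path score into c.
            score, path = max(
                (
                    (s + math.log(trans.get((p, c), floor)), prev_path)
                    for p, (s, prev_path) in best.items()
                ),
                key=lambda t: t[0],
            )
            nxt[c] = (score, path + [c])
        best = nxt

    score, path = max(best.values(), key=lambda t: t[0])
    return "".join(path), score


if __name__ == "__main__":
    sentence, logp = viterbi(["ni", "hao"], CANDIDATES, START, TRANS)
    print(sentence, logp)
```

With these toy values the highest-scoring path for `["ni", "hao"]` is 你好 (probability 0.6 × 0.8 = 0.48). Replacing the fixed `TRANS` table with scores from a learned model is one way to picture where a neural transition model could plug in, though the actual design used in the paper is not specified here.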