A Chinese input method relies on an internal input method engine (IME) that analyzes and transforms the text entered by users and presents candidate options for selection. Common Chinese IME tasks include Pinyin-to-character conversion (P2C) and automatic completion of whole sentences (ACWS); P2C can be further divided into complete P2C and abbreviated P2C. The goal of P2C is to convert the user's input Pinyin sequence into the corresponding Chinese character string and recommend it to the user, while ACWS predicts and recommends candidate sentences based on the first part of the user's input. The development of deep learning has advanced IME research, but in previous work, neural-network-based P2C and ACWS models depended heavily on the composition of the training dataset, and a trained model could not maintain high performance when applied to different users and domains. To address this problem, this thesis proposes a method that dynamically stores and updates representation vectors and exploits the user's historical input to improve the efficiency and adaptability of neural network models. Two representation-vector-based algorithms are designed, for P2C and for ACWS respectively, to improve the models' adaptability to different data. The details are as follows:

(1) The representation-vector-based adaptive P2C algorithm uses a pre-trained Transformer model to generate representation vectors carrying semantic information for the Pinyin and Chinese characters in the training set, and stores them in a datastore. During actual use, the current input is encoded and the most similar stored vectors are retrieved; the retrieved results are normalized into a probability distribution and weighted into the output of the Transformer model to form the final distribution, from which candidates are recommended to the user with a beam search algorithm. Finally, the text the user confirms is converted into representation vectors and added to the datastore, achieving adaptation to the user. Experimental results on four domain datasets with different styles confirm that an IME built with this algorithm can track user behavior effectively without further training of the network model, shows strong domain adaptability, and outperforms traditional Pinyin conversion frameworks and commercial IMEs on multiple metrics.

(2) The representation-vector-based adaptive ACWS algorithm uses a pre-trained GPT model to convert the sentences in the training set into representation vectors and stores them. When the user enters the first half of a sentence, the IME feeds it to the GPT model to generate the corresponding representation vector, which is compared with the stored vectors through similarity retrieval. The retrieved results are normalized into a probability distribution and weighted into the output of the GPT model to form the final distribution, from which candidate sentences are recommended to the user through the language model's autoregressive decoding and beam search. Finally, the text the user confirms is converted into representation vectors and added to the datastore, enhancing the adaptability of the neural network model. Experimental results on four domain datasets with different styles show that the representation-vector-based ACWS algorithm can adapt to user behavior effectively while maintaining the performance of the neural network model and improving the user experience.
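A minimal sketch of the retrieve-normalize-interpolate step shared by both algorithms is given below. It assumes a datastore of (representation vector, token id) pairs, cosine-similarity retrieval, and a fixed interpolation weight lam; the class and function names and these specific choices are illustrative assumptions for exposition, not the exact design of the thesis.

```python
import numpy as np

class VectorDatastore:
    """Stores (representation vector, token id) pairs and supports
    similarity retrieval and incremental updates from confirmed input."""

    def __init__(self, dim):
        self.keys = np.empty((0, dim), dtype=np.float32)  # representation vectors
        self.values = np.empty((0,), dtype=np.int64)      # associated token ids

    def add(self, vectors, token_ids):
        # Called after the user confirms an input: store the new
        # representation vectors so the IME adapts without retraining.
        self.keys = np.vstack([self.keys, np.asarray(vectors, dtype=np.float32)])
        self.values = np.concatenate([self.values, np.asarray(token_ids, dtype=np.int64)])

    def retrieve(self, query, k=8):
        # Cosine similarity between the query vector and all stored keys.
        if self.keys.shape[0] == 0:
            return np.empty(0), np.empty(0, dtype=np.int64)
        keys = self.keys / (np.linalg.norm(self.keys, axis=1, keepdims=True) + 1e-8)
        q = query / (np.linalg.norm(query) + 1e-8)
        sims = keys @ q
        top = np.argsort(-sims)[:k]
        return sims[top], self.values[top]


def interpolate(model_probs, datastore, query, vocab_size, k=8, lam=0.3):
    """Normalize the retrieved similarities into a probability distribution
    and weight it into the neural model's output distribution.
    (lam is an assumed fixed interpolation weight; the thesis may tune
    or learn this weighting differently.)"""
    sims, token_ids = datastore.retrieve(query, k)
    if sims.size == 0:
        return model_probs
    weights = np.exp(sims - sims.max())
    weights /= weights.sum()                 # softmax over retrieved neighbours
    retrieval_probs = np.zeros(vocab_size)
    for w, t in zip(weights, token_ids):
        retrieval_probs[t] += w              # aggregate neighbours sharing a token
    return (1 - lam) * model_probs + lam * retrieval_probs
```

In this sketch, the distribution returned by interpolate would be consumed by beam search (for P2C) or by autoregressive decoding with beam search (for ACWS), and VectorDatastore.add would be called on the user's confirmed text to realize the dynamic storage and update of representation vectors described above.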