Research On Tibetan Language Model Based On Neural Network

Posted on: 2021-02-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y Guo
Full Text: PDF
GTID: 2415330611459678
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of cloud computing, big data, artificial intelligence, and related fields, neural network language models have advanced as well, and their strengths have become evident in speech recognition, optical recognition, and natural language processing. To a certain extent, their performance has surpassed both language models built on grammatical and semantic rules and the traditional statistical N-gram language model. For Tibetan, however, a minority language, limited research conditions and scarce training data have made research difficult, so the traditional N-gram language model still occupies a very important position in Tibetan language research.

Building on the theory of language models and of neural networks, this thesis constructed a Tibetan language model based on a neural network. To understand and verify the effectiveness and performance of the neural network language model, experiments were carried out by varying the model parameters, and the traditional statistical N-gram language model was selected as the baseline for comparison. Both an N-gram and a neural network Tibetan language model were built so that the two could be compared, with the aim of obtaining a better and more effective Tibetan language model. To keep the evaluation rigorous, the thesis used perplexity, the direct evaluation measure for language models, and also applied the language model to a specific task, Tibetan text proofreading, where the quality of the model was observed indirectly through the correctness of word-level proofreading.

In the experiments, the number of hidden-layer neurons in the neural network language model was varied and context word vectors and other features were added, addressing the inability of statistical language models to capture long-distance constraints. In the text proofreading experiment, the model was likewise trained while adjusting features such as the number of hidden-layer neurons. The results showed that, compared with the traditional N-gram language model, the neural network language model greatly reduced perplexity, and that changing the number of hidden-layer neurons affected perplexity to varying degrees. Embedding word vector features on top of this reduced perplexity further and improved the performance of the language model accordingly. In the subsequent proofreading task, the neural network language model was used to check word-level correctness in Tibetan text. The experimental results showed that the optimized Tibetan neural network language model performed better than the traditional statistical N-gram model and also improved the performance of text proofreading as a specific application.
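Perplexity (sometimes translated literally as "confusion") is the exponential of the average negative log-probability a language model assigns to each token of a test text, so lower values indicate a better model. The following is a minimal sketch of how it could be computed for a bigram baseline. It is not the thesis's implementation: the add-one smoothing, the function names, and the placeholder tokens are all assumptions made for illustration.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count unigram contexts and bigrams over tokenized sentences."""
    unigrams, bigrams, vocab = Counter(), Counter(), set()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            unigrams[prev] += 1
            bigrams[(prev, cur)] += 1
    return unigrams, bigrams, vocab

def make_prob_fn(unigrams, bigrams, vocab):
    """Return P(cur | prev) with add-one smoothing (an assumed choice)."""
    v = len(vocab)
    def prob(prev, cur):
        return (bigrams[(prev, cur)] + 1) / (unigrams[prev] + v)
    return prob

def perplexity(sentences, prob_fn):
    """exp of the mean negative log-probability per predicted token."""
    log_sum, n = 0.0, 0
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            log_sum += math.log(prob_fn(prev, cur))
            n += 1
    return math.exp(-log_sum / n)

# Placeholder tokens stand in for segmented Tibetan words.
train = [["w1", "w2"], ["w1", "w2"]]
model = make_prob_fn(*train_bigram(train))
ppl = perplexity(train, model)
```

Because `perplexity` only needs a function returning P(cur | prev), the same routine could score any model that exposes conditional probabilities, which is what makes a like-for-like comparison between an N-gram baseline and a neural model possible.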
Keywords/Search Tags: Tibetan, language model, neural network