
A Study Of Tibetan-Chinese Machine Translation Method With Knowledge Fusion Of Tibetan Function Words

Posted on: 2024-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: S S Yan
Full Text: PDF
GTID: 2555307085970759
Subject: Computer software and theory
Abstract/Summary:
With the development of society and the progress of information technology, people interact more closely, and breaking the language barrier has become a necessity; machine translation is one way to achieve this. On the one hand, in-depth study of Tibetan-Chinese machine translation can help scholars who are not native Tibetan speakers understand Tibetan history and culture, which supports the protection and transmission of cultural heritage. On the other hand, it can promote interaction and exchange between ethnic groups, contribute to national unity, and support the economic development and foreign exchanges of Tibetan areas. Although Tibetan-Chinese machine translation has seen notable progress in corpus construction and methodological improvements, little research approaches Tibetan-Chinese machine translation methods from the standpoint of the grammatical characteristics of Tibetan itself. Therefore, this study combines the traditional BPE algorithm with the richness of Tibetan function words and the characteristics of Tibetan grammatical structure, and investigates Tibetan-Chinese machine translation with Tibetan function word knowledge fusion using Transformer, mBART, and model integration strategies. The main contributions of this study are as follows:

1. Three types of Tibetan function word knowledge fusion methods are proposed: (i) knowledge fusion of Tibetan free function words and unfree function words; (ii) knowledge fusion of monosyllabic, multisyllabic, and all Tibetan function words; (iii) knowledge fusion of high-frequency, medium-frequency, low-frequency, and very-low-frequency Tibetan function words.

2. Using the Transformer and mBART machine translation models, experiments were conducted on each of the three types of Tibetan function word knowledge fusion. The results show that knowledge fusion of multisyllabic, low-frequency, and very-low-frequency Tibetan function words improves the quality of Tibetan-Chinese machine translation.

3. Two model integration strategies, model parameter averaging and prediction result integration, were used to conduct Tibetan function word knowledge fusion experiments. A comparison of the two strategies shows that prediction result integration significantly improves the translation quality of the Tibetan-Chinese machine translation model.

In conclusion, on both types of Tibetan-Chinese neural machine translation models and under the model integration strategies, the experiments show that low-frequency Tibetan function word knowledge fusion works best and yields a more significant improvement.
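The abstract names the two model integration strategies but does not describe how they are implemented. As a rough illustration only, the sketch below shows one common reading of each: parameter averaging as an element-wise mean over checkpoints with identical parameter shapes, and prediction result integration as averaging the next-token probability distributions of several models before choosing a token. The function names `average_parameters` and `ensemble_next_token`, the plain-dict data layout, and the averaging rule itself are assumptions made for this example, not details taken from the thesis.

```python
from typing import Dict, List, Sequence


def average_parameters(
    checkpoints: Sequence[Dict[str, List[float]]],
) -> Dict[str, List[float]]:
    """Element-wise mean of same-named, same-length parameter vectors.

    Hypothetical layout: each checkpoint maps a parameter name to a flat
    list of floats. Real checkpoints would hold tensors instead.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    n = len(checkpoints)
    averaged: Dict[str, List[float]] = {}
    for name in checkpoints[0]:
        vectors = [ckpt[name] for ckpt in checkpoints]
        # zip(*vectors) groups the i-th value of every checkpoint together.
        averaged[name] = [sum(vals) / n for vals in zip(*vectors)]
    return averaged


def ensemble_next_token(distributions: Sequence[Dict[str, float]]) -> str:
    """Average per-model next-token probabilities and return the best token.

    Assumption: each model exposes a token -> probability mapping for the
    current decoding step; tokens missing from a model count as 0.0.
    """
    if not distributions:
        raise ValueError("need at least one distribution")
    n = len(distributions)
    vocab = set().union(*distributions)  # union of all token keys
    mean_prob = {
        tok: sum(d.get(tok, 0.0) for d in distributions) / n for tok in vocab
    }
    # Iterate in sorted order so ties resolve deterministically.
    return max(sorted(mean_prob), key=mean_prob.get)
```

Under this reading, parameter averaging produces a single model and costs nothing extra at decoding time, whereas prediction-level integration runs every model at each decoding step. The sketch does not indicate which behaves better for Tibetan-Chinese translation; that is the empirical question the thesis addresses.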
Keywords/Search Tags:Tibetan function words, knowledge fusion, Transformer model, mBART model, model integration