A Study On Lexical And Syntactic Errors In Machine Translation

Posted on: 2018-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: J Yin
GTID: 2335330512984792
Subject: Foreign Linguistics and Applied Linguistics
Abstract/Summary:
Machine translation (MT) is the process of using a computer to translate one natural language into another. As a research branch of computational linguistics, MT is built on linguistics, mathematics, and computing technology. With the rapid growth of the Internet and information technology, MT has come into widespread use. Nevertheless, its quality is not yet high enough to satisfy users' needs. Lexical and syntactic errors are key contributors to the low accuracy of MT, so a comprehensive analysis of these errors is conducive to finding ways of improving it.

To investigate lexical and syntactic errors in English-Chinese (E-C) MT of news texts, this study focuses on the frequency and typical patterns of each error category in the outputs of three commonly used translation tools. The causes of these errors are then discussed from a linguistic perspective in combination with certain MT principles, and suggestions for improving MT are offered. The author sampled 15 English news articles from The New York Times, obtained 45 translation outputs via Google Translate, Bing Translator and Youdao Translation, and identified and counted the lexical and syntactic errors in those outputs. SPSS 21.0 was used for quantitative comparative analyses of error frequencies within and among the three translation tools, and examples are given to analyze typical error patterns qualitatively.

The results indicate that the three translation tools show great similarities in lexical and syntactic errors in English-Chinese MT of news texts. Twenty-two types of errors emerge in the outputs of the three tools; word order errors and mistranslations of prepositional phrases, polysemes, and verb phrases occur most frequently; syntactic errors are significantly more frequent than lexical errors, a result the author attributes to the principles of the statistical machine translation system; and the three tools do not differ significantly in the total frequencies of lexical errors and syntactic errors or in the frequencies of 19 error subcategories. Apart from these similarities, the author finds significant differences among the three tools in word addition errors, word omission, and verb phrase mistranslation, owing to the respective features of each tool.

The study also describes and analyzes the typical patterns and causes of these errors. On the one hand, English and Chinese differ as language systems, and Chinese speakers and Westerners differ in cultural background and thinking patterns, which leads to distinct habits of expression. On the other hand, MT tools have shortcomings, such as incomplete corpora, disregard of context, limited linguistic analysis, and inaccurate phrase segmentation. Finally, the author offers suggestions for improving MT and proposes a hypothetical MT model.
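The comparison of error frequencies among the three tools can be illustrated with a small sketch. The abstract does not say which statistical test was run in SPSS, so the chi-square test of homogeneity used here is an assumption, one common choice for comparing categorical counts. The counts are hypothetical placeholders, not the thesis's data, and the function names are invented for illustration.

```python
from typing import List


def chi_square_statistic(table: List[List[int]]) -> float:
    """Pearson chi-square statistic for a contingency table.

    Rows are translation tools, columns are error categories.
    Assumes every row total and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if all tools shared one error distribution.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def degrees_of_freedom(table: List[List[int]]) -> int:
    """(rows - 1) * (columns - 1) for a rectangular table."""
    return (len(table) - 1) * (len(table[0]) - 1)


# HYPOTHETICAL placeholder counts, NOT figures from the thesis.
# Columns: [lexical errors, syntactic errors]
tools = ["Google Translate", "Bing Translator", "Youdao Translation"]
placeholder_counts = [
    [40, 60],
    [45, 55],
    [50, 50],
]

stat = chi_square_statistic(placeholder_counts)
dof = degrees_of_freedom(placeholder_counts)
print(f"chi-square = {stat:.3f}, degrees of freedom = {dof}")
```

The statistic is then compared against the chi-square distribution with the given degrees of freedom to judge whether the tools differ significantly. A statistics package such as SPSS reports the corresponding p-value directly.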
Keywords/Search Tags: E-C machine translation, lexical errors, syntactic errors, news texts