
On The Feasibility Of Corpus-based Machine Translation

Posted on: 2011-05-21
Degree: Master
Type: Thesis
Country: China
Candidate: J He
GTID: 2155330332970587
Subject: Foreign Linguistics and Applied Linguistics
Abstract/Summary:
After sixty years of development, machine translation (MT) has made great progress in both academic methods and practical applications. Despite this rapid development, the quality of MT output remains unsatisfactory: its understandability and fidelity are still poor. Further study of existing rule-based MT shows that the approach is becoming increasingly problematic, and simply compiling grammatical rules proves insufficient for modern social needs. If this is the end of rule-based MT, what about corpus-based MT? Could it bring MT quality into a new era? If so, how can corpus-based MT produce better translations? These are the questions raised and investigated in this study.
Two translation systems, the rule-based 'Systran' and the corpus-based 'Google', are evaluated against two criteria: one proposed by Prof. Feng Zhiwei, who is well known for his research on machine translation, and the other adopted by the Japan National Technology Department and designed to focus on the understandability and fidelity of MT output. On this basis, the author tries to demonstrate the feasibility of corpus-based MT.
Firstly, the English-Chinese output of the rule-based 'Systran' and the corpus-based 'Google' is compared to test translation quality at the lexical level and the structural level, and in terms of understandability and fidelity. Analysis of these data demonstrates the feasibility of corpus-based MT; two questionnaires are also designed to assist in the comparison.
Secondly, taking the corpus-based 'Google' as a tool, the author compares its English-Chinese translations of literary and non-literary works, again at the lexical level and the structural level, and in terms of understandability and fidelity.
Furthermore, the comparison between the two MT systems shows that corpus-based MT is more feasible than its rule-based counterpart for translating non-literary works. Finally, some measures are proposed to improve the quality of corpus-based MT for non-literary works: a hybrid translation method, a more finely classified database and an expanded corpus, software with a better capacity for contextual analysis, and post-editing by readers.
Given the immaturity of MT evaluation criteria, the author's subjectivity, and the limits of time, this research has many shortcomings, such as the small number of translation samples and the limited number of questionnaire participants. However, it is the author's hope that this research may positively stimulate the development of corpus-based MT.
Keywords/Search Tags: corpus-based MT, feasibility, the feasibility of corpus-based MT in non-literary works