
Integrating a Head-Modifier-Based Word-Level Reordering Model for Phrase-Based SMT

Posted on: 2012-06-30    Degree: Doctor    Type: Dissertation
Country: China    Candidate: S Liu    Full Text: PDF
GTID: 1115330362950149    Subject: Computer application technology
Abstract/Summary:
Machine translation is a classic, long-standing topic in the research community. From word-level to phrase-level models, and from rule-based to statistical methods, the field has made clear progress in recent years, maturing into an area that attracts more and more attention. In this thesis, we propose a novel method for integrating parse trees into phrase-based statistical machine translation (PBSMT). Based on this method, an SMT system is implemented. The system takes a parse tree as input and uses a PBSMT model as its main framework. During decoding, the system maps the parse tree into sets of head-modifier dependency pairs, which are then used to improve the reordering model of PBSMT.

Specifically, this thesis contains the following parts:

(1) A lexicalized head-driven parser is used to generate the input parse trees. Data sparseness is the key problem for such a parser and is usually addressed with smoothing techniques. Building on classic interpolation methods, we propose a principle based on averaged statistical events and prove its correctness with classical error theory. Applying this principle, together with the zero-frequency assumptions made by other smoothing techniques, four smoothing methods are proposed for head-driven parsing.

(2) We present a word-level reordering model for PBSMT. To integrate the model into our system, two alignment constraints are proposed, along with a method for processing word alignments under these constraints. Two definitions of reordering are then given over the processed alignments, the word-level reordering model is built on those definitions, and a parameter estimation method is provided as well.
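As a minimal sketch of the core idea (not the dissertation's code), mapping a parse to head-modifier pairs can be illustrated with a toy dependency representation; the list-of-`(word, head_index)` format used here is an assumption for illustration only.

```python
# Toy sketch: extracting head-modifier pairs from a dependency parse.
# We assume the parse is a list of (word, head_index) entries, with -1
# marking the root; this representation is hypothetical, not the thesis's.
def head_modifier_pairs(parse):
    """Return (head, modifier) word pairs from a dependency parse."""
    pairs = []
    for word, head_idx in parse:
        if head_idx >= 0:  # skip the root, which has no head
            pairs.append((parse[head_idx][0], word))
    return pairs

# "the boy saw a dog": 'saw' is the root; 'boy' and 'dog' modify it,
# and each determiner attaches to its noun.
parse = [("the", 1), ("boy", 2), ("saw", -1), ("a", 4), ("dog", 2)]
print(head_modifier_pairs(parse))
# [('boy', 'the'), ('saw', 'boy'), ('dog', 'a'), ('saw', 'dog')]
```

Each pair records a head word together with one of its modifiers, which is the kind of relation the reordering model conditions on.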
(3) The decoding algorithm is the core of our system. We propose a translation-state-based decoding method for PBSMT, covering both 1-best and n-best decoding. During decoding, hypotheses are organized by translation state, and in n-best decoding the divergence of translation states is confined to a limited range. Because this approach considers the relationship between the model and decoding as a whole, performance is clearly improved on two data sets.

(4) We propose a method that integrates the head-modifier-based word-level reordering model into both training and decoding. In training, a shift-reduce method is adopted; in decoding, word indices are used to identify the reordering types. Experimental results show that our method clearly improves over the baseline.
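To make the word-index idea in part (4) concrete, here is a hedged sketch of how source-side indices can label the order of two target-adjacent phrases. The monotone/swap/discontinuous scheme shown is the common one from phrase-based reordering models; the dissertation's exact definitions are not reproduced here.

```python
# Toy sketch (an illustrative assumption, not the thesis's exact method):
# label the reordering type between two target-adjacent phrases using
# their source-side word-index spans.
def reordering_type(prev_src_span, cur_src_span):
    """Each span is (start, end), inclusive source word positions."""
    if cur_src_span[0] == prev_src_span[1] + 1:
        return "monotone"       # current phrase directly follows the previous
    if cur_src_span[1] == prev_src_span[0] - 1:
        return "swap"           # current phrase directly precedes the previous
    return "discontinuous"      # a gap separates the two phrases

print(reordering_type((0, 1), (2, 3)))  # monotone
print(reordering_type((2, 3), (0, 1)))  # swap
print(reordering_type((0, 1), (4, 5)))  # discontinuous
```

Only the boundary indices of the spans are needed, which is what makes such word-index tests cheap to apply during decoding.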
Keywords/Search Tags: interpolation smoothing, n-best generation, word-level reordering, head-modifier relationship