| Machine Translation (MT) refers to the process of using a computer to convert one natural language into another. The BLEU score is widely used to evaluate machine translation output, but it is computed against human-written reference translations. In contrast, Quality Estimation (QE) can automatically evaluate machine translation output without access to a human-generated reference. This paper focuses on the crucial issues facing mainstream QE models under the deep learning framework, covering the following three aspects: (1) Research on enhancing the semantic correlation of the QE model. At present, it is popular to introduce pre-trained models into translation quality estimation tasks. However, since multilingual pre-trained models are usually trained on monolingual corpora in different languages, the corresponding vocabularies of different language pairs differ in the semantic space. To address this problem, this paper proposes adding a semantic association processing layer to the QE model that integrates a semantic similarity score between the source and target texts. Experimental results show that the proposed method, with semantic relevance enhanced under the concatenation mechanism, can significantly improve the performance of MT quality estimation. (2) Research on improving QE performance with data augmentation strategies. Annotated corpora for training QE models are expensive to build and small in scale. Data augmentation is a direct and effective way to deal with this problem. In this paper, two data augmentation methods are proposed. One is an indirect data augmentation method based on the Dropout mechanism, which randomly drops nodes to construct different representations of the same sentence, so as to exponentially increase the QE corpus without constructing supervisory signals. The other is a direct data augmentation method based on a denoising autoencoder. The pseudo-translations and their corresponding supervisory signals are reconstructed from the
target-side texts of a parallel corpus by the denoising autoencoder. Experimental results show that both methods can effectively improve the performance of MT quality estimation. (3) Research on integrating phrase alignment information into the QE model. Existing work shows that word alignment can effectively improve QE performance, but isolated words are prone to ambiguity because they lack context, whereas the meaning of phrases is relatively clear. It is therefore likely that the negative effects of improper word alignment can be reduced under the constraints of phrase alignment. Accordingly, this paper proposes integrating into the QE model phrase alignment probabilities derived from reliable bilingual phrases. Experimental results show that the proposed method can significantly improve the performance of MT quality estimation. To sum up, this paper presents a series of studies on machine translation quality estimation from three perspectives: model, data, and alignment granularity. Experimental results on the WMT public dataset show that the proposed methods are effective and can match or outperform current state-of-the-art QE models. |
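
The indirect, Dropout-based augmentation in aspect (2) can be illustrated with a minimal sketch. The abstract does not state which layers Dropout is applied to or at what rate, so the version below is an assumption for illustration only: it applies inverted dropout directly to a fixed sentence vector, and the names `dropout_views`, `augment`, `num_views`, and `drop_prob` are hypothetical. The point it shows is that each random mask yields a different representation of the same sentence, while the original quality score is reused unchanged, so no new supervisory signal has to be constructed.

```python
import random


def dropout_views(vector, num_views, drop_prob=0.1, seed=0):
    """Return `num_views` dropout-perturbed copies of `vector`.

    Assumption: inverted dropout on a plain list of floats. Each element
    is zeroed independently with probability `drop_prob`; survivors are
    rescaled by 1 / (1 - drop_prob) so the expected value is unchanged.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    rng = random.Random(seed)
    scale = 1.0 / (1.0 - drop_prob)
    views = []
    for _ in range(num_views):
        views.append(
            [0.0 if rng.random() < drop_prob else x * scale for x in vector]
        )
    return views


def augment(examples, num_views, drop_prob=0.1, seed=0):
    """Expand (vector, score) pairs into num_views pairs per example.

    Every perturbed view keeps the score of the example it came from.
    """
    augmented = []
    for i, (vector, score) in enumerate(examples):
        # A different seed per example gives each one its own masks.
        for view in dropout_views(vector, num_views, drop_prob, seed + i):
            augmented.append((view, score))
    return augmented


if __name__ == "__main__":
    # Toy data: two sentence vectors with made-up quality scores.
    corpus = [([0.2, -1.3, 0.7, 0.5], 0.31), ([1.1, 0.4, -0.6, 0.9], 0.78)]
    for view, score in augment(corpus, num_views=3, drop_prob=0.25):
        print(score, view)
```

In a real QE model the masks would normally come from the Dropout layers of the encoder during repeated forward passes of the same input, rather than from a separate function applied to a stored vector.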