In the field of machine translation, the proper use of phrase information is widely regarded as a key approach to improving translation quality. Traditional statistical machine translation models significantly improved translation quality by modeling phrases rather than individual words. In neural machine translation, however, research on exploiting phrase information to enhance translation quality remains relatively scarce, owing to the difficulty of determining phrase boundaries and the challenges of phrase modeling; most studies focus on token-level systems. Motivated by this gap, this paper investigates the improvement and application of phrases in neural machine translation models from three aspects, aiming to enhance the translation quality of machine translation models. The main research contents are as follows:

(1) We propose phrase-aware adaptive training and a phrase dropout mechanism. In neural machine translation models, on the one hand, traditional training objectives attempt to reduce the loss of each word but do not explicitly constrain the model's memory of phrases, leading to insufficient accuracy in phrase translation. On the other hand, in autoregressive decoding, an inaccurately translated phrase can degrade the translation of the phrases that follow it. To alleviate these two issues and improve phrase translation accuracy, we propose phrase-aware adaptive training and a phrase dropout mechanism, respectively. Phrase-aware adaptive training employs a heuristic phrase segmentation method to divide sentences into phrase segments and uses adaptive training objectives to assign appropriate weights to each word, encouraging the model to memorize phrases and thus improving the accuracy of phrase translation. Meanwhile, the phrase dropout mechanism randomly drops target phrases during training, enhancing the model's robustness to inaccurately translated phrases and mitigating their impact on subsequent phrase
translation. Experimental results show that phrase-aware adaptive training and the phrase dropout mechanism effectively improve phrase translation accuracy and overall translation quality. Further experiments demonstrate that phrase-aware adaptive training can effectively transfer the phrase knowledge of a teacher model to a student model, providing a solid foundation for the subsequent research on interlayer knowledge transfer for phrase models.

(2) We propose a neural machine translation method based on phrase attention. Empirical analysis indicates that traditional neural networks have certain limitations in capturing long-distance dependencies. This study therefore aims to help the model better understand long-distance dependencies by modeling larger translation units (i.e., phrases). First, we apply the phrase segmentation method from phrase-aware adaptive training to split the sentences in the training set into phrases. Then, we incorporate phrase information into the neural machine translation model through a phrase attention mechanism to strengthen the model's ability to capture long-distance dependencies. In addition, this study employs a contrastive learning strategy to optimize the phrase representations generated by the encoder, making their distribution more balanced and thereby strengthening the model's ability to model source-side phrases during translation generation. Experiments show that the translation quality of the phrase-based neural machine translation model is significantly improved, with especially strong performance on long sentences, further confirming the practicality and effectiveness of this method.

(3) We propose an interlayer knowledge transfer method for phrase models. In neural machine translation models, different network layers reflect different levels of abstraction, with higher layers containing richer information. This study presents a phrase-model-based interlayer knowledge
transfer method that transfers abstract knowledge about phrases and words from higher layers to lower layers, thereby improving the training stability of the lower layers and enabling them to capture more useful information. Specifically, building on the phrase attention model proposed above, we adopt a knowledge distillation method to pass phrase and word attention information from higher layers to lower layers, improving the lower layers' ability to abstract words and phrases. At the same time, we employ phrase-aware adaptive training to transfer richer phrase constraints from higher layers to lower layers. Notably, the teacher model and the student model in this method are the same model, in contrast to the distinct models used in traditional knowledge distillation. Experimental results show that the phrase-model-based interlayer knowledge distillation method effectively improves the translation quality of neural machine translation models, confirming the effectiveness of the approach.
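To make the phrase dropout mechanism of contribution (1) concrete, the following is a minimal sketch, assuming phrases are given as half-open token spans and that dropped phrases are replaced by a mask token. The function name, mask symbol, and drop probability are illustrative assumptions, not the thesis's actual implementation.

```python
import random

def phrase_dropout(tokens, phrase_spans, p=0.15, mask="<mask>", rng=None):
    """Randomly mask whole target-side phrases during training.

    tokens:       list of target tokens
    phrase_spans: half-open (start, end) index pairs, one per phrase
    p:            probability of dropping each phrase independently
    """
    rng = rng if rng is not None else random.Random()
    out = list(tokens)
    for start, end in phrase_spans:
        # Drop the entire phrase at once, so the model must recover
        # from a missing phrase rather than a single missing token.
        if rng.random() < p:
            for i in range(start, end):
                out[i] = mask
    return out
```

Passing an explicit `rng` makes the masking reproducible for debugging; in training one would use a fresh generator per epoch.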
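The phrase-aware adaptive training objective assigns per-token loss weights based on phrase membership. One simple weighting scheme (an illustrative assumption; the thesis's exact formula is not given here) is to up-weight every token that falls inside a multi-token phrase:

```python
def phrase_token_weights(num_tokens, phrase_spans, alpha=0.5):
    """Per-token loss weights that emphasize multi-token phrases.

    Tokens inside a phrase of length > 1 receive weight 1 + alpha,
    so the cross-entropy loss pushes the model to memorize phrases.
    """
    weights = [1.0] * num_tokens  # standard uniform weighting
    for start, end in phrase_spans:
        if end - start > 1:
            for i in range(start, end):
                weights[i] = 1.0 + alpha
    return weights
```

These weights would then multiply the per-token cross-entropy terms before summation, leaving single-token segments at the baseline weight.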
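The interlayer knowledge transfer of contribution (3) distills higher-layer attention into lower layers of the same model. A minimal sketch of such a loss, assuming attention maps are given as row-normalized probability matrices (this is a generic KL-based attention distillation term, not necessarily the thesis's exact objective):

```python
import math

def attention_transfer_loss(low_attn, high_attn, eps=1e-9):
    """KL(high || low), summed over query positions.

    The lower layer's attention (student) is pulled toward the
    higher layer's attention (teacher) within the same model.
    """
    loss = 0.0
    for teacher_row, student_row in zip(high_attn, low_attn):
        loss += sum(p * math.log((p + eps) / (q + eps))
                    for p, q in zip(teacher_row, student_row))
    return loss
```

Because the teacher and student are layers of one network, the teacher rows would be detached from the computation graph in practice so that gradients only flow into the lower layer.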