At present, research on English-Chinese machine translation in specific professional fields is often limited by scarce corpus resources, the difficulty of obtaining them, and the lack of authoritative, domain-specific material, which greatly hinders the development of machine translation in these fields. Targeting the characteristics of texts in the field of electrical engineering, this paper takes an attention-based neural machine translation model as the baseline model, proposes different embedding-layer parameter initialization methods, and improves the model structure so as to raise translation quality in the electrical engineering domain. The main work of this paper is as follows:

1. Propose different embedding-layer parameter initialization methods. In view of the scarcity of parallel corpora for English-Chinese machine translation in the field of electrical engineering, an improved embedding-layer parameter initialization method that fuses domain term information is proposed on the basis of training the translation model on a general corpus. First, the text is preprocessed by word segmentation so that term words are split into minimal units. Then, two sets of word vectors, trained with GloVe and Word2Vec on different monolingual corpora, are used to initialize the embedding-layer vector representations of common words and term words respectively. Finally, a term dictionary is used to look up and replace out-of-vocabulary words, which alleviates the severe unknown-word problem caused by terminology during translation. An attention-based neural machine translation model is used as the baseline system in the experiments. The results show that the improved model gains 2.713 BLEU points on the test corpus in the electrical engineering field. A minimal sketch of this initialization is given below, after item 2.

2. Propose an improved encoder for enhancing the source-language representation. Initializing the encoder embedding layer of an end-to-end model with pre-trained word embeddings is a practical trick for enhancing the source-language representation in low-resource neural machine translation. The usual approach is to train word embeddings on large-scale general monolingual data and use them to initialize the embedding layer of an RNN-based neural machine translation model. However, for a specific translation task in the field of electrical engineering, word embeddings obtained in this way lack domain relevance, which easily leads to problems in the translation results such as ambiguity and errors or omissions in professional vocabulary. Therefore, this paper improves the encoder design of the model: by adding a pre-training layer and a residual mechanism, the model can learn a better source-language representation from the bilingual training corpus in this field. Four different network structures were used as the pre-training layer in the experiments. The results show that translation performance in the electrical engineering field improves by 0.539 to 1.94 BLEU points. A sketch of this encoder variant follows the embedding example below.
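To make the first contribution concrete, the following is a minimal sketch of the domain-aware embedding initialization and the term-dictionary OOV replacement. The file names, the term-dictionary format, and all helper names are illustrative assumptions, not the paper's actual implementation; it only shows the general idea of mixing general (GloVe-style) vectors for common words with domain (Word2Vec-style) vectors for electrical-engineering terms.

```python
# Sketch: build an embedding matrix that uses general vectors for common words
# and domain-trained vectors for electrical-engineering term words, then replace
# out-of-vocabulary tokens via a term dictionary. Paths and names are assumptions.
import numpy as np

EMB_DIM = 300

def load_text_vectors(path):
    """Read 'word v1 v2 ...' lines (GloVe/word2vec text format) into a dict."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) != EMB_DIM + 1:   # skip header or malformed lines
                continue
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

general_vecs = load_text_vectors("glove.general.300d.txt")     # common words (assumed path)
term_vecs = load_text_vectors("word2vec.ee_domain.300d.txt")   # domain terms (assumed path)

def build_embedding_matrix(vocab, term_set):
    """vocab: list of words; term_set: words marked as EE terms after segmentation."""
    matrix = np.random.normal(scale=0.1, size=(len(vocab), EMB_DIM)).astype(np.float32)
    for idx, word in enumerate(vocab):
        if word in term_set and word in term_vecs:
            matrix[idx] = term_vecs[word]       # domain vector for term words
        elif word in general_vecs:
            matrix[idx] = general_vecs[word]    # general vector for common words
        # otherwise keep the random initialization
    return matrix

def replace_oov_with_terms(tokens, vocab, term_dictionary):
    """Replace out-of-vocabulary tokens with an in-vocabulary entry from the
    term dictionary when one exists; otherwise fall back to a generic <unk>."""
    known = set(vocab)
    out = []
    for tok in tokens:
        if tok in known:
            out.append(tok)
        elif tok in term_dictionary and term_dictionary[tok] in known:
            out.append(term_dictionary[tok])
        else:
            out.append("<unk>")
    return out
```

The resulting matrix would be copied into the embedding layer of the baseline model before training, so that term words start from domain-specific vectors rather than random values.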
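The second contribution can be sketched in a similar hedged way. The abstract reports trying four different structures for the pre-training layer; the single linear layer below is only a placeholder assumption, as are the class name, dimensions, and the choice of a bidirectional GRU encoder.

```python
# Illustrative sketch of the improved encoder: a pre-training layer inserted between
# the embedding and the bidirectional GRU, wrapped in a residual connection.
import torch
import torch.nn as nn

class ImprovedEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim, hidden_dim, pretrained_matrix=None):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        if pretrained_matrix is not None:
            # initialize from the domain-aware embedding matrix built earlier
            self.embedding.weight.data.copy_(torch.as_tensor(pretrained_matrix))
        # "pre-training layer": one of several possible structures; a linear layer
        # stands in here for whichever of the four networks is used.
        self.pretrain_layer = nn.Linear(emb_dim, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True, bidirectional=True)

    def forward(self, src_ids):
        emb = self.embedding(src_ids)                    # (batch, seq, emb_dim)
        refined = torch.tanh(self.pretrain_layer(emb))
        emb = emb + refined                              # residual connection
        outputs, hidden = self.rnn(emb)                  # encoder states for attention
        return outputs, hidden

# usage sketch with dummy token ids
enc = ImprovedEncoder(vocab_size=32000, emb_dim=300, hidden_dim=512)
src = torch.randint(0, 32000, (2, 10))
outputs, hidden = enc(src)
```

The residual connection lets the encoder fall back on the original pre-trained embeddings when the pre-training layer adds little, which is one plausible reading of why the reported gains vary (0.539 to 1.94 BLEU) across the four tested structures.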