
Research And Application Of Chinese Named Entity Recognition

Posted on: 2023-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y. L. Wang
Full Text: PDF
GTID: 2558306914477724
Subject: Electronics and Communications Engineering
Abstract/Summary:
Named Entity Recognition (NER) is a critical task in natural language processing. It aims to identify the locations of entities in text sequences and classify them into specific types, such as organization, location, and title. NLP tasks such as information extraction, question answering, and relation extraction also rely on NER as an upstream task to provide entity information. Models based on deep learning have achieved reliable results on NER; however, Chinese NER is more complicated than English NER. Whereas English exhibits regular morphological variation (tense, singular/plural, etc.), Chinese text requires word segmentation, so it suffers from blurred word boundaries, complicated entity structures, and diverse expressions. This thesis therefore proposes optimizations of the NER algorithm for the Chinese context and applies the resulting model in practice. The main work of this thesis is as follows:

1. A Chinese named entity recognition encoder (Multi-head Lattice LSTM) based on Lattice LSTM is proposed. The lattice structure introduces word-segmentation information on top of character-level features. We divide the Lattice LSTM into multiple channels that compute features independently, concatenate the channel outputs, and feed them through a fully connected layer, a residual connection, and a normalization layer, so that the Lattice LSTM is integrated into a Transformer block and the network obtains more accurate and sufficient feature information.

2. An attention fusion layer is proposed. To reduce the defects that an LSTM can only extract one-directional context and that positional-encoding information attenuates as the network deepens, we use a self-attention layer with relative positional encoding to obtain, for each hidden vector, attention features over the complete sequence. A learnable highway-network gate then decides which attention features are retained or forgotten, making the network more effective.

3. A Chinese NER user-interface program was developed. To make entity recognition accessible to non-researchers, users of this windowed program can load a pretrained NER model of their choice, enter the text to be recognized through the input window, and, after confirmation, obtain the predicted entities in the output window.

The proposed method is evaluated on three common Chinese NER datasets. Experimental comparisons show that the Multi-head Lattice structure, the attention fusion layer, and the overall model structure each improve the results, so Chinese named entities can be extracted and classified more accurately.
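The multi-head channel design in contribution 1 can be illustrated with a minimal NumPy sketch. This is not the thesis's actual implementation: the per-channel `tanh` transforms below are hypothetical stand-ins for the Lattice LSTM channels, and all weights are random illustrative values. The sketch shows only the structural idea of splitting features into channels, concatenating the channel outputs, projecting them, and applying the residual connection and layer normalization of a Transformer block.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each feature vector to zero mean and unit variance."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def multi_channel_block(x, channel_fns, Wo):
    """Run each channel (stand-ins for per-head Lattice LSTM channels) on
    a slice of the features, concatenate, project, then apply the
    residual connection and layer normalization of a Transformer block."""
    heads = np.split(x, len(channel_fns), axis=-1)   # one slice per channel
    feats = np.concatenate(
        [fn(h) for fn, h in zip(channel_fns, heads)], axis=-1)
    return layer_norm(x + feats @ Wo)                # residual + norm

rng = np.random.default_rng(1)
seq_len, d, n_heads = 5, 8, 2
x = rng.standard_normal((seq_len, d))                # character-level features
# Hypothetical per-channel transforms standing in for Lattice LSTM channels.
Ws = [rng.standard_normal((d // n_heads, d // n_heads)) * 0.1
      for _ in range(n_heads)]
channels = [lambda h, W=W: np.tanh(h @ W) for W in Ws]
Wo = rng.standard_normal((d, d)) * 0.1               # output projection
out = multi_channel_block(x, channels, Wo)
print(out.shape)  # (5, 8)
```

Because layer normalization is applied last, every output vector has zero mean, regardless of what the individual channels compute.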
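The attention fusion layer in contribution 2 can likewise be sketched in NumPy under simplifying assumptions. The relative positional encoding here is reduced to a learned scalar bias per relative offset added to the attention scores, and the highway gate follows the standard formulation g * h + (1 - g) * x; the actual thesis model may differ in both respects. All weights are random illustrative values.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def rel_pos_self_attention(x, Wq, Wk, Wv, rel_bias):
    """Scaled dot-product self-attention with a relative-position bias
    added to the score matrix (a simplified stand-in for relative
    positional encoding)."""
    n, d = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)
    idx = np.arange(n)
    # Look up one learned bias per relative offset in [-(n-1), n-1].
    scores = scores + rel_bias[idx[:, None] - idx[None, :] + n - 1]
    return softmax(scores) @ v

def highway_fusion(x, h, Wg, bg):
    """Highway gate: g decides how much of the attention feature h to
    retain versus the original hidden vector x."""
    g = 1.0 / (1.0 + np.exp(-(x @ Wg + bg)))
    return g * h + (1.0 - g) * x

rng = np.random.default_rng(0)
n, d = 6, 8
x = rng.standard_normal((n, d))                  # LSTM hidden vectors
Wq, Wk, Wv, Wg = (rng.standard_normal((d, d)) * 0.1 for _ in range(4))
rel_bias = rng.standard_normal(2 * n - 1) * 0.1  # one bias per offset
bg = np.zeros(d)
attn = rel_pos_self_attention(x, Wq, Wk, Wv, rel_bias)
out = highway_fusion(x, attn, Wg, bg)
print(out.shape)  # (6, 8)
```

Pushing the gate bias strongly negative makes the gate pass the LSTM hidden vectors through unchanged, while a strongly positive bias keeps only the attention features, which is exactly the retain-or-forget decision the fusion layer learns.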
Keywords/Search Tags: Lattice LSTM, Transformer block, self-attention, highway networks, Chinese named entity recognition