Design And Implementation Of Intelligent Text Annotation Platform Combined With Active Learning

Posted on:2023-10-20

Degree:Master

Type:Thesis

Country:China

Candidate:Q Q Lv

GTID:2558307070984119

Subject:Engineering

Abstract/Summary:

With the continuous development of the Internet, the amount of information online is growing explosively, and most of it exists as unstructured text. To use this text effectively, the key information in it must be annotated. Manual labeling is inefficient, tedious, and error-prone, and existing annotation tools offer only limited support for controlling data quality and reducing annotation workload. As a result, manual annotation remains costly, and the quality of the labeled data is difficult to guarantee. Against this background, the main work of this paper covers the following two aspects:

(1) To address the shortcomings of existing annotation tools in ease of operation, annotation quality, and annotation efficiency, this paper designs and implements an easy-to-use text annotation platform. Beyond the basic functions of an annotation tool, the platform provides data quality analysis functions to control the quality of annotated data, auxiliary annotation functions to reduce annotators' workload, and support for online model training. This paper compares the time cost of labeling the same data with and without the auxiliary labeling function; the experimental results show that the auxiliary labeling function improves labeling efficiency.

(2) Traditional annotation tools offer little support for choosing which texts to annotate first in order to improve model performance. To address this, this paper proposes an auxiliary annotation method based on active learning, which evaluates the uncertainty of a text relative to the model by the mean information entropy of the text. The higher the uncertainty, the more the text is expected to improve model performance once labeled. The method first uses a deep learning model to predict character-level label probabilities for the unlabeled text, then computes the mean information entropy of each unlabeled text from these probabilities, and finally sorts the unlabeled texts by their uncertainty relative to the model, so that users can label the most uncertain texts first. Comparative experiments on the Boson dataset demonstrate the effectiveness of the method: compared with other methods, selecting training samples by mean information entropy improves model performance more quickly and leads to better predictions.
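The selection step described in (2) can be illustrated with a minimal sketch. It assumes the model has already produced, for every character of each unlabeled text, a probability distribution over labels; the function names, the text ids, and the probability values below are hypothetical and are not taken from the thesis. The sketch computes the mean Shannon entropy per text and ranks texts from most to least uncertain.

```python
import math
from typing import Dict, List, Sequence


def mean_char_entropy(char_probs: Sequence[Sequence[float]]) -> float:
    """Average Shannon entropy (in nats) over the characters of one text.

    char_probs[i] is the predicted label distribution for character i.
    """
    if not char_probs:
        return 0.0
    total = 0.0
    for dist in char_probs:
        # Zero probabilities are skipped: p * log(p) tends to 0 as p -> 0.
        total += -sum(p * math.log(p) for p in dist if p > 0.0)
    return total / len(char_probs)


def rank_by_uncertainty(
    predictions: Dict[str, Sequence[Sequence[float]]]
) -> List[str]:
    """Return text ids sorted from most to least uncertain."""
    return sorted(
        predictions,
        key=lambda text_id: mean_char_entropy(predictions[text_id]),
        reverse=True,
    )


# Hypothetical model output: two characters per text, three labels each.
predictions = {
    "confident": [[0.98, 0.01, 0.01], [0.97, 0.02, 0.01]],
    "uncertain": [[0.34, 0.33, 0.33], [0.40, 0.30, 0.30]],
    "mixed": [[0.98, 0.01, 0.01], [0.34, 0.33, 0.33]],
}

ranking = rank_by_uncertainty(predictions)
print(ranking)
```

In an annotation workflow, the texts at the head of such a ranking would be offered to annotators first, and the ranking would be recomputed after the model is retrained on the newly labeled data.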

Keywords/Search Tags:

Active learning, Text annotation, Information entropy mean, Named entity recognition

Related items

1	The Research Of Weibo Entity Recognition Model Based On Active Learning
2	The Research On Named Entity Recognition In Chinese Information Processing
3	Research On Named Entity Recognition Methods For Unstructured Text
4	Automatic Approaches To Develop Large-scale TCM Electronic Medical Record Corpus For Named Entity Recognition Tasks
5	Research On Tibetan Named Entity Recognition Model Based On Active Learning
6	Named Entity Recognition Method For Labeling Scarce Problem
7	Research Of Entity And Relation Extraction Based On Text
8	Research On Named Entity Recognition For Web Recruitment Text Based On Deep Learning
9	Research On Chinese Named Entity Recognition Based On Deep Learning
10	Research On Named Entity Recognition And Entity Link Method For Short Text Questions