Research On Short Text Classification Algorithm Of Obstetric Electronic Medical Record Based On BERT And CNN

Posted on:2021-03-22

Degree:Master

Type:Thesis

Country:China

Candidate:L Zhang

GTID:2404330647960153

Subject:Computer technology

Abstract/Summary:

Obstetric electronic medical records are the main channel through which doctors gain a full picture of the condition of pregnant women and fetuses, and they are of great significance for improving the reproductive health of the population. Structured processing is an important method for mining information from the unstructured text in electronic medical records, and it improves the efficiency of medical staff. Text classification is a key module of the structuring function and plays a crucial role in the quality of the final structured output. The rapid development of deep learning brings more possibilities for solving text classification tasks, so it is of great practical value to study how to combine new techniques with existing solutions to further improve their accuracy.

This thesis uses a text data set of delivery records from obstetric electronic medical records to propose a short-text (sentence-level) classification algorithm for the six categories in the delivery record. The algorithm is improved in the following three aspects:

(1) The BERT pretrained language model is used to produce the vectorized representation of each sentence. This avoids the heavy reliance of traditional Chinese word vectors on the word segmentation algorithm and improves the ability of the feature vector to express the context of the text.

(2) Obstetric medical record text is written irregularly, and its sentence boundaries are difficult to identify. A sequence labeling method based on Bi-LSTM-CRF is therefore applied to the sentence segmentation task, which strengthens sentence segmentation in the data preprocessing stage.

(3) A convolutional neural network with multiple convolution layers is used as the classifier, which enhances the model's ability to extract higher-level features.

The experimental results show that the proposed BERT + CNN network model reaches an F1 value of 94% on the text classification task for obstetric electronic medical records, which is about 6% higher than the benchmark model TextCNN, and the F1 gap can reach 10% on smaller data sets. The F1 value of the sentence segmentation algorithm reaches about 80%, with Bi-LSTM + CRF giving the better result. This thesis applies some of the most popular natural language processing techniques of recent years to improve traditional text classification, provides more options for structuring medical records, and offers a reference for future related research.
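
The abstract frames sentence segmentation as sequence labeling but does not give the tag scheme or the exact F1 definition. The following is a minimal sketch of that framing, assuming a hypothetical two-tag scheme in which "B" marks the first character of a sentence and "I" marks every other character. The function names `segment_by_tags` and `boundary_f1` are illustrative and do not come from the thesis, and the tags would in practice be predicted by a model such as Bi-LSTM-CRF rather than written by hand.

```python
from typing import List


def segment_by_tags(chars: List[str], tags: List[str]) -> List[str]:
    """Group characters into sentences using per-character boundary tags.

    Assumed (hypothetical) scheme: "B" starts a new sentence, "I" continues it.
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")
    sentences: List[str] = []
    current: List[str] = []
    for ch, tag in zip(chars, tags):
        # A "B" tag closes the sentence collected so far and opens a new one.
        if tag == "B" and current:
            sentences.append("".join(current))
            current = []
        current.append(ch)
    if current:
        sentences.append("".join(current))
    return sentences


def boundary_f1(gold_tags: List[str], pred_tags: List[str]) -> float:
    """F1 over predicted sentence-start positions (one possible definition)."""
    gold = {i for i, t in enumerate(gold_tags) if t == "B"}
    pred = {i for i, t in enumerate(pred_tags) if t == "B"}
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy input only; not data from the thesis.
    chars = list("abcde")
    gold = ["B", "I", "B", "I", "I"]
    pred = ["B", "I", "I", "B", "I"]
    print(segment_by_tags(chars, gold))
    print(boundary_f1(gold, pred))
```

Scoring only the sentence-start positions keeps the metric independent of sentence length; the thesis may well compute its reported F1 differently.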

Keywords/Search Tags:

BERT, Short text classification, Sequence annotation, Obstetric electronic medical records

Related items

1	Research On Multi-label Classification For Obstetric Electronic Medical Records Based On Knowledge Fusion
2	Design And Implementation Of Intelligent Diagnosis Guidance System Based On Deep Learning
3	Clinical Named Entity Recognition From Chinese Electronic Medical Records Using A Double-layer Annotation Model
4	Research On Named Entity Recognition Of Electronic Medical Records Based On BERT Model
5	Research On The Storage Method Of Electronic Medical Records Based On Graph Database
6	Research Of Intelligent Hepatopathy Auxiliary Diagnosis System Based On Text Semantic Analysis Of Electronic Medical Records
7	Research On Clinical Electronic Medical Record Analysis Technology Based On K-BERT
8	Research On Recognition Of Medical Time Expressions And Events In Chinese Electronic Medical Records
9	Research On Medical Semantic Network Construction Method Based On Chinese Electronic Health Records Text
10	Analysis Of Radiological Report For Clinical Decision Support