Research And Implementation Of Medical Text Attribute Extraction System Based On Small Sample

Posted on: 2022-01-09
Degree: Master
Type: Thesis
Country: China
Candidate: W K Wang
Full Text: PDF
GTID: 2504306497972599
Subject: Computer technology
Abstract/Summary:
Deep learning technology is reshaping various industries. In the field of pathological diagnosis, doctors record the diagnosis results of pathological slices as unstructured text, which contains a great deal of valuable information. An attribute extraction system can structure this text and extract novel and useful knowledge to assist doctors in diagnosis and treatment decisions. This paper takes the pathological diagnosis text of colorectal cancer as its research object. There are 11 attributes to be extracted from the text, and they can be grouped into two tasks: text classification and sequence tagging. Deep learning models for attribute extraction usually need a large number of labeled samples. This paper proposes a joint model based on multitask learning and transfer learning, which focuses on the overfitting caused by the small sample size of medical text, and completes the design and implementation of a colon cancer text attribute extraction system based on this model.

On the one hand, the high cost of annotating medical text leads to a lack of sufficient training data, so traditional models are prone to overfitting. The joint model trains multiple tasks at the same time. Thanks to its strong text feature extraction ability and the sharing of underlying encoding information among the learning tasks, each task benefits from implicit data augmentation. Compared with the single-task model, the accuracy of each attribute is significantly improved, reaching more than 95%.

On the other hand, medical texts from different data sources differ in mode of expression, described content, and extraction requirements, so a separate model must be built for each data source. Transfer learning can transfer the model parameters of the source domain to related target-domain tasks and fine-tune the network in the target domain, which improves the performance of the target-domain model.

In this paper, we design the data flow model and the conceptual model based on the above scheme and the actual needs of the pathology department, and implement the attribute extraction system on a B/S (browser/server) architecture. The system can extract all 11 attributes from a colorectal cancer diagnosis text at the same time, generate a standardized report, and support offline knowledge transfer. In addition, the solution adopted for the colon cancer attribute extraction system has reference value for other medical applications with small samples.
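
The two ideas in the abstract, a joint model whose tasks share one encoder and the transfer of source-domain parameters to a target domain, can be illustrated with a toy sketch. This is not the thesis's implementation, which the abstract does not describe in detail. Everything here is an assumption made for illustration: the names `JointModel` and `transfer_encoder`, the embedding lookup standing in for a real text encoder, the mean pooling, the unweighted sum of the two losses, and the toy tokens and numbers. It shows only the forward pass and the joint loss, with no gradient updates.

```python
import copy
import math


def softmax(scores):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, vec):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def cross_entropy(probs, gold):
    """Negative log-probability assigned to the gold label index."""
    return -math.log(probs[gold])


class JointModel:
    """Hypothetical joint model: one shared encoder, two task-specific heads."""

    def __init__(self, embeddings, cls_weights, tag_weights):
        self.embeddings = embeddings    # shared by both tasks (assumed encoder)
        self.cls_weights = cls_weights  # text classification head
        self.tag_weights = tag_weights  # sequence tagging head

    def encode(self, tokens):
        """Shared encoding: here just an embedding lookup per token."""
        return [self.embeddings[t] for t in tokens]

    def joint_loss(self, tokens, cls_gold, tag_gold, tag_weight=1.0):
        states = self.encode(tokens)  # computed once, used by both heads
        # Classification head works on the mean-pooled sentence vector.
        pooled = [sum(col) / len(states) for col in zip(*states)]
        cls_loss = cross_entropy(softmax(linear(self.cls_weights, pooled)), cls_gold)
        # Tagging head predicts one label per token; average over tokens.
        tag_loss = sum(
            cross_entropy(softmax(linear(self.tag_weights, s)), g)
            for s, g in zip(states, tag_gold)
        ) / len(states)
        # Assumed combination: weighted sum of the two task losses.
        return cls_loss + tag_weight * tag_loss


def transfer_encoder(source, cls_weights, tag_weights):
    """Start a target-domain model from a copy of the source encoder,
    with fresh heads that would then be fine-tuned on target data."""
    return JointModel(copy.deepcopy(source.embeddings), cls_weights, tag_weights)


# Toy source-domain model: 2-dimensional embeddings, 2 classes, 3 tag labels.
source = JointModel(
    embeddings={
        "moderately": [0.1, 0.3],
        "differentiated": [0.2, -0.1],
        "adenocarcinoma": [0.5, 0.4],
    },
    cls_weights=[[0.0, 0.0], [0.0, 0.0]],
    tag_weights=[[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]],
)
tokens = ["moderately", "differentiated", "adenocarcinoma"]
loss = source.joint_loss(tokens, cls_gold=1, tag_gold=[0, 0, 1])

# Target domain with different label sets: 4 classes, 2 tag labels.
target = transfer_encoder(
    source,
    cls_weights=[[0.0, 0.0] for _ in range(4)],
    tag_weights=[[0.0, 0.0] for _ in range(2)],
)
```

Because both heads read the same encoded states, a training signal from either task would update the shared parameters, which is the sense in which each task can gain from the other's labeled data. In the transfer step only the encoder is carried over, so the target domain is free to define its own attributes and label sets.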
Keywords/Search Tags:small sample, joint model, transfer learning, attribute extraction system