
Design And Implementation Of Electronic Medical Record Information Extraction System

Posted on: 2020-10-06
Degree: Master
Type: Thesis
Country: China
Candidate: X H Ni
GTID: 2404330611454933
Subject: Software engineering
Abstract/Summary:
After decades of use and improvement, the electronic medical record (EMR) has become increasingly sophisticated. Medical records are stored mainly in two forms, structured and unstructured. Unstructured text makes it easy to express a wide range of medical concepts and events, and it remains by far the main form of clinical documentation. At present, multiple hospital systems need to extract various kinds of relevant data from the course records in EMR text. To address this need, this paper designs a program that automatically and flexibly extracts medical named entities and entity relationships from the EMR. The software first performs word segmentation and part-of-speech tagging on the course records. It then combines templates with machine learning: rules handle text with significant linguistic features, and a support vector machine (SVM) handles text whose linguistic features are described in more individualized ways. Experimental analysis found that the SVM extracts better than rules alone, and that the rules also serve as a useful aid to feature learning.

The paper mainly covers the following:

1) EMR information extraction first requires Chinese word segmentation. After the needs and objectives were determined, ICTCLAS was selected as the word segmentation tool. Building on GATE, we implemented, step by step, word segmentation, sentence segmentation, grammatical tagging, vocabulary collection, and rule definition for batches of course records from our hospital.

2) For data with significant linguistic features in the EMR text, extraction relies mainly on JAPE rules and on collected medical vocabularies.

3) For information in the EMR with individualized descriptions and linguistic features, a support vector machine is used to study the recognition of contextual features, word-level linguistic features, semantic features, and so on over large-scale EMR data. Entity recognition uses the entity itself and the lexical features surrounding it. Relationship extraction uses the NLP features of each of the two entities together with features of the two entities in combination. Each co-occurring pair of entities is linked through identifiers. Small training sets are imbalanced, with few positive examples and many negative ones; for these, an SVM with uneven margins is used, relying on its margin parameter to improve text classification.

4) The system was designed and implemented with an object-oriented development method, a three-tier browser/server (B/S) architecture, Visual Studio 2013, and SQL Server 2008, and was then deployed in the hospital EMR environment. Test results show that the EMR text information extraction system meets the daily clinical needs of the hospital and provides a basis for text data querying and further applications.
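The entity and entity-pair features described in 3) can be illustrated with a minimal Python sketch. It is not the thesis's GATE-based implementation. The function names, the feature names, the `<PAD>` marker, the window sizes, and the sample tokens and tags are all hypothetical, chosen only to show how the entity itself, its surrounding lexical context, and the combination of two entities might be turned into features for a classifier such as an SVM.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag); the tag set is assumed


def window_features(tokens: List[Token], index: int, size: int = 2) -> Dict[str, str]:
    """Features for tokens[index]: the token itself plus its neighbours."""
    word, pos = tokens[index]
    features = {"word": word, "pos": pos}
    for offset in range(-size, size + 1):
        if offset == 0:
            continue
        position = index + offset
        if 0 <= position < len(tokens):
            neighbour_word, neighbour_pos = tokens[position]
        else:
            # Pad positions that fall outside the sentence.
            neighbour_word, neighbour_pos = "<PAD>", "<PAD>"
        features[f"word[{offset:+d}]"] = neighbour_word
        features[f"pos[{offset:+d}]"] = neighbour_pos
    return features


def pair_features(tokens: List[Token], first: int, second: int, size: int = 1) -> Dict[str, str]:
    """Features for a candidate relation between two entity tokens."""
    features: Dict[str, str] = {}
    # Features of each entity on its own, prefixed by its role in the pair.
    for prefix, index in (("e1", first), ("e2", second)):
        for name, value in window_features(tokens, index, size).items():
            features[f"{prefix}.{name}"] = value
    # Features describing the two entities in combination.
    features["pair.words"] = tokens[first][0] + "|" + tokens[second][0]
    features["pair.distance"] = str(abs(second - first))
    return features


# Hypothetical, already segmented and tagged sentence.
sample = [("patient", "n"), ("has", "v"), ("fever", "n")]
entity_features = window_features(sample, 2, size=1)
relation_features = pair_features(sample, 0, 2, size=1)
```

In a setup like this, each feature dictionary would be converted to a sparse vector before training. Padding out-of-range positions keeps the feature set the same size for tokens at the start or end of a sentence.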
Keywords/Search Tags:electronic medical record, information extraction, entity, entity relationship, machine learning