
Research On ICD Intelligent Coding System Based On Machine Learning

Posted on: 2021-06-20  Degree: Master  Type: Thesis
Country: China  Candidate: Y Y Wang
GTID: 2504306503975419  Subject: Biomedical engineering
Abstract/Summary:
Since the International Classification of Diseases (ICD) was published, it has gradually become a worldwide standard. According to the World Health Organization, 70% of the world's medical records are coded with ICD. At present, ICD coding in Chinese hospitals depends mainly on professional coders. However, because of the large population and the complexity of diseases, the efficiency and quality of ICD coding are low, which seriously hinders the Healthy China strategy. This paper aims to realize intelligent ICD coding using natural language processing and machine learning, and treats ICD coding as a multi-label classification problem, since each patient may have multiple diseases. First, the paper adopts the classifier chain model to exploit the correlation between labels and integrates different traditional machine learning models through ensemble learning. Experiments show that the classifier chain model clearly outperforms the binary relevance model, which ignores the correlation between labels, and that ensemble learning achieves better results than a single traditional machine learning model. Second, the paper extends the principle of the classifier chain model to deep learning and proposes a CNN-LSTM model based on the self-attention mechanism, in which a convolutional neural network extracts text features and a recurrent neural network realizes the classifier chain principle. Finally, the self-attention mechanism is used to further mine the correlation between labels. Experiments show that, when the correlation between labels is not considered, the deep learning model and the ensemble-learning-based classifier chain model perform almost the same, whereas the deep learning model improves significantly when LSTM and the self-attention mechanism are used to mine the correlation between labels.
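The abstract contrasts binary relevance with a classifier chain but does not show either. As a rough illustration only, the following is a minimal sketch of the two schemes in plain Python. It is not the thesis's implementation: the tiny logistic-regression base learner, the toy keyword-indicator features, the two placeholder labels, and all function names are assumptions made for this example.

```python
import math


def train_logistic(X, y, epochs=200, lr=0.5):
    """Fit a tiny logistic-regression model by stochastic gradient descent.

    Returns a weight list whose last entry is the bias term.
    """
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = w[-1] + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - yi  # gradient of the log loss with respect to z
            for j, xj in enumerate(xi):
                w[j] -= lr * g * xj
            w[-1] -= lr * g
    return w


def predict_logistic(w, x):
    """Return 1 if the linear score is positive, else 0."""
    z = w[-1] + sum(wj * xj for wj, xj in zip(w, x))
    return 1 if z > 0 else 0


def train_binary_relevance(X, Y):
    """One independent classifier per label; label correlation is ignored."""
    return [train_logistic(X, [y[k] for y in Y]) for k in range(len(Y[0]))]


def predict_binary_relevance(models, x):
    return [predict_logistic(w, x) for w in models]


def train_chain(X, Y):
    """Classifier chain: the k-th classifier also sees labels 0..k-1 as features."""
    models = []
    for k in range(len(Y[0])):
        X_aug = [x + y[:k] for x, y in zip(X, Y)]  # true earlier labels at train time
        models.append(train_logistic(X_aug, [y[k] for y in Y]))
    return models


def predict_chain(models, x):
    """Predict labels in chain order, feeding each prediction to the next link."""
    preds = []
    for w in models:
        preds.append(predict_logistic(w, x + preds))
    return preds


# Hypothetical toy data: three keyword indicators per record, two placeholder labels.
X = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
Y = [[1, 0], [0, 1], [1, 1], [0, 0]]

br = train_binary_relevance(X, Y)
chain = train_chain(X, Y)
```

In the chain, the second classifier receives one extra input (the first label), which is how label correlation can enter the model; binary relevance has no such path. The toy data is too simple to show an accuracy difference, so it should not be read as evidence for the experimental results reported in the abstract.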
Keywords/Search Tags: ICD code, Machine learning, Natural language processing, Multi-label classification