Application of Topic Model in Text Mining of Military Medical Service Information

Posted on: 2019-09-03, Degree: Master, Type: Thesis
Country: China, Candidate: Y S Zhao
GTID: 2416330563955863, Subject: Military logistics
Abstract/Summary:
Objective: This study applies text mining to the chief inspection conclusions of physical examination reports, which are written in natural language, in order to provide technical support for future text mining of military medical physical examination reports.

Methods: Using the chief inspection conclusions in the physical examination reports of 36,712 staff members as an example, medical terms describing abnormal clinical results were recognized and extracted with a Hidden Markov Model (HMM) in the R programming language environment. The average performance of the HMM was evaluated by cross-validation. A lexicon of abnormal clinical results was constructed from the recognized and extracted medical terms. The abnormal chief inspection conclusions were then clustered using topic models.

Results: (1) Under cross-validation, the HMM used to recognize and extract medical terms achieved an average precision of 91.79%, recall of 80.31%, and F1-score of 85.64%. (2) The lexicon of abnormal clinical results contained 2,328 medical terms in total. The number of false positive descriptions of abnormal chief inspection conclusions was 413. After standardization, 791 medical terms remained in the lexicon. (3) The abnormal chief inspection conclusions were examined by word frequency analysis. For men, the five most frequent medical terms for abnormal clinical results were “Triglycerides”, “Fatty Liver”, “Overweight”, “Hypertension”, and “Helicobacter Pylori Positive”. For women, they were “Helicobacter Pylori Positive”, “Triglycerides”, “Uterine Cervicitis”, “Fibrocystic Disease of Breast”, and “Cervical adenocele”. (4) The Latent Dirichlet Allocation (LDA) topic model fitted with the Gibbs sampling algorithm performed best. Based on the topic model, the abnormal chief inspection conclusions were clustered into topics composed of medical terms. Grouped by age, the conclusions for men were clustered into 3 topics: “Dyslipidemia, Fatty Liver”; “Cardiovascular and Cerebrovascular Diseases”; “Metabolic Abnormalities, Helicobacter Pylori Positive”. The conclusions for women were clustered into 3 topics: “Uterine Cervicitis, Fibrocystic Disease of Breast”; “Malnutrition”; “Leiomyoma, Atherosclerosis”. Grouped by working location, the conclusions for men were clustered into 6 topics: “Atherosclerosis, Fatty Liver”; “Dyslipidemia, Helicobacter Pylori Positive”; “Dyslipidemia”; “Hypertriglyceridemia, Helicobacter Pylori Positive, Fatty Liver”; “Hypertension, Fatty Liver”; “Dyslipidemia, Pharyngitis”. The conclusions for women were clustered into 6 topics: “Uterine Cervicitis, Dyslipidemia”; “Fibrocystic Disease of Breast, Pharyngitis”; “Dyslipidemia”; “Helicobacter Pylori Positive, Fibrocystic Disease of Breast”; “Helicobacter Pylori Positive”; “Gynecologic Inflammation”.

Conclusions: (1) The lexicon construction method used in this study can serve as a foundation for building a lexicon of abnormal clinical results for the text data of military medical physical examination reports. (2) The topic model provides a clustering method for the text data of military medical service information, such as chief inspection conclusions in military physical examination reports in peacetime, and medical tags and battlefield medical records in wartime. It also shows the similarity between topics through plots of the cosine values of the topic correlation matrix, and the high-frequency key medical terms of each topic through topic word clouds.
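
The conclusions describe comparing topics by the cosine values of a topic correlation matrix. A minimal sketch of that comparison is given below, assuming each topic is represented as a vector of term probabilities over a shared vocabulary. The thesis worked in R; Python is used here only for illustration, and the function names and the three toy topic vectors are hypothetical, not taken from the study.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def topic_similarity_matrix(topic_term):
    """Pairwise cosine similarity between the rows of a topic-term matrix."""
    return [[cosine_similarity(a, b) for b in topic_term] for a in topic_term]


# Hypothetical topic-term probabilities over a 4-term vocabulary
# (illustrative values only, not results from the thesis).
topics = [
    [0.70, 0.20, 0.05, 0.05],
    [0.60, 0.30, 0.05, 0.05],
    [0.05, 0.05, 0.10, 0.80],
]

sim = topic_similarity_matrix(topics)
for row in sim:
    print(" ".join(f"{value:.3f}" for value in row))
```

Values close to 1 indicate topics that weight the same terms heavily; values close to 0 indicate topics with little vocabulary in common. In this toy input the first two topics are far more similar to each other than either is to the third.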
Keywords/Search Tags:Military medical service information, Medical information management, Text mining, Topic model, R programming language