
Research On Syndromes Classification Model Of Chronic Fatigue Syndrome Based On Multi-label Learning

Posted on: 2019-08-13  Degree: Master  Type: Thesis
Country: China  Candidate: H H Xia
GTID: 2404330566972837  Subject: Computer Science and Technology
Abstract/Summary:
Chronic Fatigue Syndrome (CFS) is a modern disease that has become a serious and poorly understood health problem. Because Western medicine still faces challenges in the pathological study of CFS, Traditional Chinese Medicine (TCM) syndrome differentiation and treatment can provide a useful reference for its diagnosis and treatment. However, the subjectivity of TCM syndrome differentiation affects the correctness of the diagnosis. To explore scientific methods of auxiliary diagnosis, most existing work applies data mining to TCM data so that latent information can be extracted to guide syndrome differentiation. In this thesis, we take patients' symptoms and their corresponding syndromes as samples, in which the symptom information serves as the features and the syndromes serve as the labels. Multi-label learning is introduced to study a TCM syndrome classification model that provides a useful reference for diagnosing the disease. To reduce the influence of irrelevant and redundant features in the CFS dataset on classification, this thesis proposes a multi-label feature selection algorithm based on max-dependency and min-redundancy that exploits the correlation between labels. In addition, it proposes a CFS syndrome classification algorithm based on random walk and conditional random field, which makes full use of both the label and the feature information in the dataset. The main work of this thesis is as follows:

(1) To address the large number of irrelevant and redundant symptom features in CFS data, we propose a Multi-Label Feature Selection Algorithm based on max-dependency and min-redundancy (LR-MDMR) that uses the correlation between labels. The algorithm uses the mutual information between syndromes to describe the correlation between labels (that is, the redundancy of label information). The reciprocal of a label's information redundancy is used as that label's weight, which reduces the dependency of a feature on correlated labels. The information redundancy between features covers not only the redundancy between individual features but also the pairwise dependency between features for each class label. LR-MDMR selects the feature subset that maximizes the mutual information between the features and the label set while minimizing the mutual information among the features. Experimental results show that, compared with existing feature selection algorithms on public datasets, LR-MDMR reduces One Error, Hamming Loss and Coverage by 3.1% to 11.6% and improves Average Precision by 4.0% to 7.9%. On the CFS dataset, LR-MDMR performs well on a variety of evaluation criteria compared with using the original feature set. In short, LR-MDMR effectively selects the features most related to the labels, which can provide a data basis for research on standardizing TCM syndrome classification.

(2) Existing TCM syndrome classification models seldom analyze, at the same time, the relationships between symptoms and syndromes and the relationships among the syndromes themselves for a specific disease, which limits classification accuracy. To solve this problem, this thesis proposes a CFS Syndromes Classification Algorithm based on random walk and conditional random field (CRWCRF). CRWCRF considers the influence of each feature on different labels, introduces a measure of feature importance, and predicts the label probabilities of each sample in different label subgraphs. It then constructs a conditional random field from the predicted label probabilities and the mutual information between labels to obtain the final label set of the sample being predicted. Experiments on public datasets and the CFS dataset show that CRWCRF outperforms existing algorithms on evaluation criteria such as One Error and Hamming Loss. It can provide a reference for the diagnosis of CFS.

(3) Based on the LR-MDMR and CRWCRF algorithms, we design and implement a prototype system for CFS classification and prediction. The system can not only select the key symptoms of CFS but also differentiate a patient's syndromes from the patient's four-diagnosis information, helping doctors determine the patient's condition quickly.
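
As a rough illustration of the max-dependency/min-redundancy idea in item (1), the sketch below greedily picks discrete features that share high label-weighted mutual information with the labels and low mutual information with the features already chosen. It is a simplified, mRMR-style sketch and not the thesis's LR-MDMR implementation: the function names (`mutual_information`, `label_weights`, `select_features`) are hypothetical, the weight formula `1 / (1 + redundancy)` is an assumption standing in for "the reciprocal of label information redundancy", and the per-label pairwise dependency term mentioned in the abstract is omitted.

```python
import numpy as np


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    x = np.asarray(x)
    y = np.asarray(y)
    mi = 0.0
    for xv in np.unique(x):
        px = np.mean(x == xv)
        for yv in np.unique(y):
            py = np.mean(y == yv)
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * py))
    return mi


def label_weights(Y):
    """Assumed weighting: reciprocal of (1 + summed MI with the other labels)."""
    n_labels = Y.shape[1]
    weights = np.empty(n_labels)
    for j in range(n_labels):
        redundancy = sum(
            mutual_information(Y[:, j], Y[:, k])
            for k in range(n_labels)
            if k != j
        )
        # The "+ 1" avoids division by zero for independent labels (assumption).
        weights[j] = 1.0 / (1.0 + redundancy)
    return weights


def select_features(X, Y, k):
    """Greedily select k feature indices: high relevance, low redundancy."""
    n_features = X.shape[1]
    w = label_weights(Y)
    # Relevance: label-weighted mutual information with every label.
    relevance = np.array([
        sum(w[j] * mutual_information(X[:, f], Y[:, j])
            for j in range(Y.shape[1]))
        for f in range(n_features)
    ])
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        best, best_score = None, -np.inf
        for f in remaining:
            # Redundancy: mean MI with the features chosen so far.
            redundancy = 0.0
            if selected:
                redundancy = np.mean([
                    mutual_information(X[:, f], X[:, s]) for s in selected
                ])
            score = relevance[f] - redundancy
            if score > best_score:
                best, best_score = f, score
        selected.append(best)
        remaining.remove(best)
    return selected
```

Here `X` is assumed to be a samples-by-symptoms array of discrete values and `Y` a samples-by-syndromes binary array. A call such as `select_features(X, Y, 10)` returns the column indices of the chosen symptoms. Down-weighting correlated labels keeps a feature from being rewarded several times for predicting what is effectively the same syndrome information.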
Keywords/Search Tags:TCM syndromes classification, Multi-Label, Feature Selection, Random Walk, Conditional Random Field, Chronic Fatigue Syndrome