
Application Of KPCA And Adaboost Feature Selection Algorithms Based On Directed Brain Networks In The Classification Of Alzheimer’s Disease Course

Posted on: 2024-03-14  Degree: Master  Type: Thesis
Country: China  Candidate: G S Yin  Full Text: PDF
GTID: 2544307148980909  Subject: Basic Medicine
Abstract/Summary:
Objective: With the shift in China’s population structure, China is expected to become a deeply aged society by 2027. Alzheimer’s disease (AD) has become the main cause of loss of daily living ability among people over 60 in China, placing a heavy burden on their families and society. AD is an irreversible neurodegenerative disease characterized by progressive cognitive dysfunction and behavioral impairment, and no drug has been shown to clearly reverse the cognitive impairment. AD seriously impairs patients’ basic social, occupational, and daily living abilities, and as the disease advances patients lose the ability to live independently altogether. At present, the diagnosis of AD relies mainly on clinical manifestations, cognitive scales, and electroencephalography. These approaches lack quantitative standards and are delayed: most diagnoses are made in the middle and late stages, so the best window for intervention is easily missed. Biomarker-based methods are easier to quantify, but they are invasive and should not be performed too frequently. With the recent development of neuroimaging techniques based on molecular imaging, MR imaging is increasingly used for automated medical analysis and etiological interpretation. Brain networks offer a simplified representation of the interaction patterns among brain regions, and many neurological diseases are accompanied by a severe decline in cognitive function, which may result from abnormal connectivity among multiple functional brain regions. In neuroscience, brain networks are widely used to study brain and mental diseases, including schizophrenia, depression, and attention deficit hyperactivity disorder. Building on brain networks, this study applies recent machine learning models to the automatic classification of AD imaging data, using machine learning to assist clinical diagnosis with the aim of early detection, early intervention, and delayed onset.

Methods:
1. Data collection. The Alzheimer’s Disease Neuroimaging Initiative (ADNI) is a database focused on imaging data for Alzheimer’s disease; it also covers clinical data from blood, cerebrospinal fluid, and psychiatric scales. This study included 21 subjects in the ADNI database, including 12 AD patients, 7 with EMCI, 6 with LMCI, and 8 NC individuals. There were no significant differences between the groups in variables such as gender and age.
2. Data handling. The preprocessing steps were: conversion of image files from DICOM format to NIFTI format using an open-source Python toolkit; exclusion of the first few unstable time points at the start of the MR scan, generally 10-22 time points; slice timing correction, with the middle slice generally chosen as the reference; head motion correction, in which data with excessive head movement were treated as invalid, using a threshold of 1.5 mm; spatial normalization to a standard brain template; Gaussian smoothing and denoising; removal of the linear drift of the scanning instrument; bandpass filtering to the 0.01-0.08 Hz study band to eliminate system noise and physiological noise; and removal of the covariates age and gender.
3. Building the directed brain network. First, the whole brain was divided into 90 regions of interest (ROIs) using the AAL brain atlas, which is based on traditional anatomy. This step was implemented with the REST toolbox and yields the 90 nodes of the directed brain network. The time series of each node was then extracted, and the Pearson correlation coefficient between the time series of each region of interest and those of the other brain regions was calculated. Next, the directed edges of the network were defined by calculating the Granger causal coefficient between pairs of nodes. Finally, the threshold of the brain network was defined: CCM calculates the P-value for the time series of every pair of brain regions, and the corrected P-value threshold is obtained by multiple comparison correction based on the false discovery rate (FDR).
4. Construction of the two classification models. In multivariate statistics, kernel principal component analysis (KPCA) performs non-linear dimensionality reduction and copes better with data sets that are not linearly separable. In essence, the KPCA algorithm is a nonlinear extension of the PCA algorithm. KPCA introduces a nonlinear mapping Φ on top of PCA and assumes that any vector in the mapped space can be expressed as a linear combination of all samples in that space. By solving the eigenequation λv = Cv, we retained the eigenvalues and eigenvectors whose cumulative contribution rate exceeded 85%. AdaBoost builds a strong ensemble from classifiers with weak generalization performance. The essence of the AdaBoost algorithm is to re-weight each training sample according to the sample distribution; the better a classification function performs, the greater its corresponding weight. Assume that M weak base classifiers are generated after M rounds. For rounds m = 1, ..., M, with R categories, the weak classification functions are finally weighted and combined into the strong classification function. To obtain reliable and stable results, we used 10-fold cross-validation to evaluate the classification performance of the different models.

Results:
1. Prediction performance of the KPCA and AdaBoost algorithms for different features. Both the KPCA and AdaBoost models achieved high classification accuracy. To confirm that the difference between the two classification models was statistically significant, we applied a binomial test to the results in SPSS (p<0.05). When the two models used the average coefficient of change as a single feature, the classification accuracy was the highest among all single features, indicating that the average coefficient of change of the nodes in each brain region may be the key to distinguishing AD patients from cognitively normal individuals. Clinically, this may be related to the decreased EEG amplitude and slowed α rhythm seen in the early stage of AD. In addition, classification performance improved further after feature fusion, reaching 94.78% and 94.35% for the two models, respectively. The performance of the two classifiers is close for most features. The KPCA model is better than AdaBoost overall, while AdaBoost is better than KPCA when node strength is the selected feature.
2. Prediction performance of the KPCA and AdaBoost algorithms for different disease course groups. In this experiment, the NC, EMCI, and LMCI subjects, representing three disease courses, were paired into three comparisons: NC vs EMCI, NC vs LMCI, and EMCI vs LMCI. The KPCA and AdaBoost algorithms were used to classify each of the three comparisons. In all three, the classification performance of the KPCA algorithm was better than that of the AdaBoost algorithm, which is consistent with the results for different feature choices above. Overall, the classification models proposed in this study showed high classification accuracy for the LMCI group, reaching 85.34% and 83.16%, while the classification accuracy for the NC group was significantly lower than for the other two groups. This may indicate that the two algorithms distinguish only weakly between fMRI images of patients with early AD and those of cognitively normal individuals. This is also consistent with clinical observations, which indicate that the difference in fMRI-BOLD cortical signals between cognitively normal individuals and early AD patients is not obvious.

Conclusion:
1. The topological properties of brain networks are complex and include both global and local properties. As features for classifier training, we selected the topological properties that differed significantly between groups: node degree, average rate of change, network connection strength, and the fusion of these three features. The fused features achieved the best classification performance in both models.
2. The two classification models, KPCA and AdaBoost, both achieved good classification performance. The optimal model is the KPCA algorithm using the fused features of the directed brain network, with an accuracy of 94.78%.
3. When the average coefficient of change is selected as a single feature, the classification accuracy is the highest among all single features, indicating that the average coefficient of change of the nodes in each brain region may be the key to distinguishing AD patients from cognitively normal individuals.
4. The classification performance of the KPCA algorithm was better overall than that of the AdaBoost algorithm. Overall, the classification models in this study had slightly lower accuracy for the NC group than for the other two groups, and higher accuracy for the LMCI group, reaching 85.34% and 83.16%.
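
The directed-edge step in Methods 3 can be illustrated with a minimal sketch. It assumes a lag-1 autoregressive model and the log variance-ratio form of the Granger coefficient, and it uses the Benjamini-Hochberg procedure as the FDR correction. The abstract does not state the model order, the exact coefficient definition, or the FDR variant, so these choices, the function names `granger_coefficient` and `fdr_threshold`, and the simulated signals are all illustrative assumptions rather than the author's implementation.

```python
import numpy as np

def granger_coefficient(source, target, lag=1):
    """ln(restricted residual variance / full residual variance) for source -> target."""
    y = target[lag:]
    n = len(target)
    # Column k holds the series delayed by k+1 samples, aligned with y
    own = np.column_stack([target[lag - k - 1:n - k - 1] for k in range(lag)])
    other = np.column_stack([source[lag - k - 1:n - k - 1] for k in range(lag)])
    ones = np.ones((len(y), 1))
    restricted = np.hstack([ones, own])        # target's own past only
    full = np.hstack([ones, own, other])       # plus the source's past
    res_r = y - restricted @ np.linalg.lstsq(restricted, y, rcond=None)[0]
    res_f = y - full @ np.linalg.lstsq(full, y, rcond=None)[0]
    return float(np.log(res_r.var() / res_f.var()))

def fdr_threshold(p_values, q=0.05):
    """Benjamini-Hochberg cutoff: largest p-value still accepted at FDR level q."""
    p = np.sort(np.asarray(p_values, dtype=float).ravel())
    m = p.size
    passed = p <= q * np.arange(1, m + 1) / m
    return float(p[passed].max()) if passed.any() else 0.0

# Simulated example (not ADNI data): x drives y with a one-sample delay
rng = np.random.default_rng(0)
n = 500
x = rng.standard_normal(n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.8 * x[t - 1] + 0.1 * rng.standard_normal()

gc_xy = granger_coefficient(x, y)   # influence of x on y
gc_yx = granger_coefficient(y, x)   # influence of y on x
cutoff = fdr_threshold([0.001, 0.025, 0.029, 0.035, 0.2], q=0.05)
```

In a 90-node network, the coefficient would be computed for every ordered pair of ROI time series, and an edge would be kept only where the pair's p-value falls at or below the FDR cutoff.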
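
The KPCA step in Methods 4 can be sketched as follows: build a kernel matrix, center it in feature space, solve the eigenproblem, and keep the leading components until their cumulative contribution rate exceeds 85%. The RBF kernel, the `gamma` value, the function name `kpca_by_contribution`, and the random input are assumptions made for illustration; the abstract does not name the kernel or its parameters.

```python
import numpy as np

def kpca_by_contribution(X, gamma=0.1, min_contribution=0.85):
    """Project samples onto the kernel principal components whose
    cumulative contribution rate first exceeds min_contribution."""
    # RBF kernel matrix from pairwise squared distances (kernel choice is an assumption)
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))
    # Center the kernel matrix, i.e. center the mapped samples in feature space
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J
    # Eigen-decomposition, sorted by decreasing eigenvalue
    vals, vecs = np.linalg.eigh(Kc)
    vals, vecs = vals[::-1], vecs[:, ::-1]
    vals = np.clip(vals, 0.0, None)            # drop tiny negative round-off
    cumulative = np.cumsum(vals) / vals.sum()
    # First component count whose cumulative contribution is strictly above the target
    k = int(np.searchsorted(cumulative, min_contribution, side="right")) + 1
    # Training-sample projections: unit eigenvectors scaled by sqrt(eigenvalue)
    Z = vecs[:, :k] * np.sqrt(vals[:k])
    return Z, cumulative

# Random stand-in for a subjects-by-features matrix (not study data)
rng = np.random.default_rng(1)
X = rng.standard_normal((40, 6))
Z, cumulative = kpca_by_contribution(X)
n_kept = Z.shape[1]
```

The reduced matrix `Z` would then be passed to a downstream classifier in place of the original network features.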
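
The AdaBoost re-weighting loop described in Methods 4 can be sketched in a simplified form. This version handles only two classes labeled -1 and +1 and uses decision stumps as the weak classifiers, whereas the abstract mentions R categories and does not name the base learner. The function names, the number of rounds, and the toy data are therefore assumptions for illustration only.

```python
import numpy as np

def stump_predict(X, feature, threshold, sign):
    """Decision stump: -sign at or below the threshold, +sign above it."""
    return np.where(X[:, feature] <= threshold, -sign, sign)

def fit_stump(X, y, w):
    """Search for the stump with the lowest weighted error."""
    best = (np.inf, 0, 0.0, 1)
    for feature in range(X.shape[1]):
        for threshold in np.unique(X[:, feature]):
            for sign in (1, -1):
                pred = stump_predict(X, feature, threshold, sign)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, feature, threshold, sign)
    return best

def adaboost_fit(X, y, rounds=40):
    n = len(y)
    w = np.full(n, 1.0 / n)                      # start from a uniform sample distribution
    ensemble = []
    for _ in range(rounds):
        err, feature, threshold, sign = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)    # keep the logarithm finite
        alpha = 0.5 * np.log((1 - err) / err)    # better stump -> larger weight
        pred = stump_predict(X, feature, threshold, sign)
        w = w * np.exp(-alpha * y * pred)        # raise weights of misclassified samples
        w /= w.sum()
        ensemble.append((alpha, feature, threshold, sign))
    return ensemble

def adaboost_predict(X, ensemble):
    """Strong classifier: sign of the weighted vote of all stumps."""
    score = sum(a * stump_predict(X, f, t, s) for a, f, t, s in ensemble)
    return np.where(score >= 0, 1, -1)

# Toy data with a diagonal class boundary that no single stump can fit well
rng = np.random.default_rng(2)
X = rng.standard_normal((200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)
ensemble = adaboost_fit(X, y)
train_accuracy = float(np.mean(adaboost_predict(X, ensemble) == y))
```

The figure computed here is training accuracy on toy data. The study reports 10-fold cross-validated accuracy instead, which would require fitting on nine folds and scoring on the held-out fold in turn.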
Keywords/Search Tags:Alzheimer’s disease, machine learning, functional magnetic resonance imaging, Kernel principal component analysis, Brain network