Autism spectrum disorder (ASD) is a complex set of neurodevelopmental disorders that affect a person’s ability to socialise, behave and communicate. With the development of MRI technology, methods of diagnosing autism from brain MRI data are being explored. Most current autism classification studies based on structural MRI (sMRI) data use single-centre data, and their findings are often markedly inconsistent. This is because small-sample studies do not yield generalisable results for a highly heterogeneous neurological disorder such as autism, so cross-centre, large-sample MRI studies of brain disorders should be a priority research direction. However, existing studies have shown that the classification accuracy on multi-centre autism data is usually very low.

In this paper, we first identify, through a literature review and experimental analysis, the main reasons for the low classification accuracy on multi-centre autism data: (1) the batch effect of multi-centre data; (2) the heterogeneity of the brain among different individuals; (3) the redundancy of existing structural brain region features; and (4) the insufficient recognition ability of a single classifier. Based on these findings, this paper proposes an autism classification model based on structural brain clustering for large-sample, multi-centre sMRI data. The main work and contributions are as follows:

(1) We found that the main reason multi-centre classification is significantly less accurate than single-centre classification is the batch effect introduced into sMRI data by the different acquisition sites. This paper therefore applies the ComBat algorithm, a batch-effect removal method with relatively good performance, to reduce the batch effect of the multi-centre sMRI data. Classification accuracy improved by about 5.6% compared with the results before data correction.

(2) A summary of existing studies reveals that autism is highly heterogeneous, which is an important reason for the significant inconsistency in structural brain results reported in the literature and a major obstacle to autism identification. This work therefore explores whether populations with the same neuroanatomical structure show more consistent autistic features, which would make autism easier to recognise. This paper uses an improved FCM clustering algorithm to identify four brain subtypes, and autism classification is then performed within each subtype. The experimental results show that identifying different brain subtypes from sMRI can reduce inter-individual brain heterogeneity and improve the accuracy of autism identification within subtypes. The classification accuracy of each subtype improved by up to 9.84% compared with the overall results without clustering.

(3) For feature engineering, so that the model can learn the autistic brain features of people with different neuroanatomical subtypes, this paper performs feature selection for each subtype separately. The experimental results show that classification accuracy improved by up to 1.74% after feature selection.

(4) For the ensemble classifier, the optimal classifiers trained for each subtype serve as base classifiers, and their predictions are combined using soft voting. To determine the base classifier weights, this paper exploits the soft partitioning of the FCM algorithm: each base classifier's prediction is weighted by the probability of the sample being assigned to the corresponding subtype. The experimental results show that the soft-voting ensemble classifier improves classification accuracy by about 3.32% compared with the results without ensemble learning.

Finally, ablation experiments on each module and comparisons with other algorithms were conducted on the multi-centre autism dataset, and the results demonstrate that the proposed model performs very well on the multi-centre autism classification problem. The whole model is also integrated into an application tool for autism prediction, which can import data and output prediction results. For multi-centre autism sMRI data, the proposed model achieves an overall classification accuracy of 75.68%, an 18.95% improvement in accuracy over classification with a traditional single classifier.
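
The membership-weighted soft voting in contribution (4) can be illustrated with a minimal sketch. It is not the implementation used in this paper: it assumes the standard FCM membership formula (not the improved variant), fixed cluster centres, a 0.5 decision threshold, and per-subtype ASD probabilities that are already available. The function names `fcm_memberships` and `weighted_soft_vote` and the toy values are hypothetical.

```python
import numpy as np


def fcm_memberships(X, centers, m=2.0, eps=1e-12):
    """Standard FCM membership of each sample in each cluster (assumed formula).

    u[i, k] = d_ik^(-2/(m-1)) / sum_j d_ij^(-2/(m-1)), so each row sums to 1.
    """
    # Euclidean distances, shape (n_samples, n_clusters)
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.maximum(d, eps)  # guard against a sample lying exactly on a centre
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def weighted_soft_vote(memberships, subtype_probs, threshold=0.5):
    """Weight each subtype classifier's P(ASD) by the sample's FCM membership.

    memberships:   (n_samples, n_subtypes), rows sum to 1
    subtype_probs: (n_samples, n_subtypes), P(ASD) from each subtype's classifier
    """
    p_asd = (memberships * subtype_probs).sum(axis=1)
    return p_asd, (p_asd >= threshold).astype(int)


# Toy example with two features and two subtypes (illustrative values only).
X = np.array([[0.0, 0.0], [1.0, 1.0]])
centers = np.array([[0.0, 0.0], [2.0, 2.0]])
subtype_probs = np.array([[0.9, 0.1], [0.2, 0.6]])

u = fcm_memberships(X, centers)
p_asd, labels = weighted_soft_vote(u, subtype_probs)
```

In this toy case the first sample lies on the first centre, so its prediction is driven almost entirely by that subtype's classifier, while the second sample is equidistant from both centres and the two classifiers contribute equally.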