| In China, the population affected by brain diseases and dysfunction is the largest and is still growing. The most typical brain diseases, such as depression, Parkinson’s disease and Alzheimer’s disease, have become an urgent social problem. Accurate identification of brain electrophysiological activity plays a vital role in treating these diseases in a timely and effective manner. At present, brain science and clinical practice rely heavily on non-invasive brain imaging technologies, such as electroencephalography (EEG) and (functional) magnetic resonance imaging (MRI), to examine the complex dynamic systems under observation. Analysis of pathological brain imaging data is necessary to assist diagnosis and to understand the mechanisms of brain diseases. However, there is still a significant gap between the performance of existing analysis techniques and the requirements of brain disease diagnosis. The basic theoretical methods in three aspects, namely feature construction, synchronization measurement and disease identification, need to be studied in depth, as highlighted by the following three technical bottlenecks. Feature Construction: Multidimensional pathological brain imaging data (MPBID) record the complexity of brain activity and are typically signals with notoriously high dimensionality and enormously complicated interdependencies among data elements. Compared with low-dimensional data such as time series, multi-dimensional data carry richer information and are more suitable for expressing brain activity. Moreover, MPBID such as MRI/fMRI are routinely multi-dimensional and intensively non-linear by nature. At present, traditional methods such as the wavelet transform are suitable for two-way data and cannot accurately express the correlation between different dimensions. The traditional PARAllel FACtor analysis (PARAFAC) and Tucker factorization are theoretically suitable for analysing MPBID. However, the factors are derived linearly, which leads 
to large factorization errors when processing intensively non-linear data. Recently emerged Bayesian alternatives rely heavily on manually selected prior distributions and empirically set hyper-parameters to ensure the applicability of the approach and the quality of the structured features. Synchronization Measurement: The overall mechanism hidden in the brain can be explored only when inter-region synchronization is measured. Traditional linear synchronization measures, such as correlation analysis, cannot accurately measure the synchronization between paired EEG data that are intensively non-linear. Current non-linear synchronization measures, such as those based on the Hilbert transform, are mainly suitable for extracting the phase information of wideband signals with low noise and low interference. However, it is difficult to accurately extract the phase synchronization patterns of EEG signals contaminated by intensive noise. More recently, information-entropy and graph-based methods have emerged to measure the synchronization of two matrices that may be heterogeneous in dimension. In practice, however, their solutions usually depend on strategies of counting, summing and weighting. These theories and methods largely ignore the topology of the connectivity matrices, i.e., the distribution of correlation patterns in the virtual space. This loss of structural information about the inter-region correlation may severely distort the analysis. Disease Identification: The performance of conventional end-to-end models (e.g., deep neural networks, DNNs) often becomes unstable because they largely remain static, whereas the brain states of diseases are highly dynamic and exhibit significant individuality. Traditionally, the hyper-parameters of end-to-end models are tuned manually according to expert experience, which limits the scope and period of application. In addition, the process of stochastic optimization is highly ad hoc and 
uncontrollable, which makes the process unstable. Recent Bayesian optimization methods face the problems of a large hyper-parameter space and the indivisibility of the optimization model. Their exponential time complexity makes them impractical. To address these three requirements and challenges, this thesis proposes several algorithms based on the theories of high-order tensor factorization and Bayesian optimization, and achieves the following innovative results: (1) Non-linear analysis via deep factorization of multi-dimensional brain imaging data. To address the large error that linear factorization incurs on non-linear MPBID, this thesis proposes a deep factorization model (D-PARAFAC, Section 4.2) that automatically learns the factors (representation) of the original MPBID without supervision by a priori knowledge or hypothesized conditions; the idea was initially inspired by the unsupervised representation learning of autoencoders. The general design principle of D-PARAFAC is to construct accurate features from the MPBID itself with an end-to-end learning model. The strategy is (1) first to decompose the MPBID in a non-linear manner without introducing supervision, and (2) then to refine the derived factors with structural information enhanced and stability enforced. Tensor matricization is applied to unfold the MPBID into multiple matrices along all N dimensions (also called domains or modes). An individual deep convolutional neural network (CNN) factorizes these slices to obtain the Mode-i factor matrix that “deeply” fits the tensor via its built-in non-linear activation function. The N correlated CNNs then jointly derive the overall factor(s) through this forward non-linear fitting in parallel. The factor matrices from the N modes are all aggregated with a Hilbert basis tensor via the tensor product. The extended model aims to restore the structural information of the original tensor as fully as possible. The factors are 
refined by minimizing the error between the reconstructed tensor (from the set of factors in an iteration) and the original one, in the same way as autoencoders do. The final factors are then constructed automatically at the end. (2) Inter-region correlation analysis based on similarity measurement of heterogeneous matrices. To address the problem that the structural information of brain regions is not accurately measured, this thesis proposes a topology-sensitive approach to inter-region correlation analysis based on Heterogeneous Matrix Similarity Measurement (HMSM), which avoids information loss and distortion of the similarity results. First, the Maximal Information Coefficient (MIC) is used to measure the synchronization between paired channels, because it captures non-linearity and is robust to noise. In view of this desirable feature, this thesis extends the MIC measure to quantify the global inter-region correlation by combining MIC with a correlation matrix, i.e., the Correlation Matrix based on MIC (CMMIC). The similarity is measured by deriving the bridge matrix that quantifies the distance from the source matrix to the target matrix, both of arbitrary dimensions. This process is mapped to an optimization problem involving the three matrices, which is solved by the HMSM algorithm on the basis of Gradient Descent (GD). (3) Stable identification of brain disorders via grouping Bayesian optimization. To address the difficulty that a static model has in adapting to the non-stationary evolution of the pathological brain state, this thesis proposes a new Bayesian optimization algorithm, namely Grouping Bayesian Optimization (GBO), to quickly optimize the model for EEG classification. A “Divide-and-Conquer” strategy is applied to group the hyperparameters and optimize them within each group. It first partitions the full hyperparameter space into several independent groups according to functionality. Bayesian optimization is then applied to each group. The next 
target group is then selected by a Markov process so that the whole model converges to the global optimum as quickly as possible. The method effectively improves the stability of the model, which can continuously improve its own performance to adapt to newly arriving data. Experiments on Parkinson’s disease and depression evaluation have been conducted on MPBID. The results of the Parkinson’s disease evaluation indicate that the information obtained by the proposed method exceeds that of traditional methods, including PARAFAC, Tucker factorization and Bayesian tensor factorization, by more than 8%. Furthermore, the method makes the structural feature factors interpretable with respect to Parkinson’s disease pathology, and matches the findings of existing work from Professor Heiko’s team at Goethe University Frankfurt (Neurobiology of Aging 24, 197-211, 2003): the occurrence of Parkinson’s disease is strongly related to lesions in deep brain areas. The results of the depression evaluation indicate that the measure is an appropriate biomarker, given its statistically significant capability to distinguish the effective group from the non-effective group, which preliminarily lays the foundation for an accurate understanding of the interaction between brain regions in treatment outcome. Moreover, the performance of the model optimized by GBO is more than 4% higher than that of the retrained model. It executes 3.5 times faster than a conventional counterpart. It can automatically optimize the hyperparameters of an end-to-end model, which preliminarily lays the foundation for rapid and stable monitoring of the brain state. In summary, aiming at the technical bottlenecks in the analysis of MPBID (auxiliary diagnosis and mechanism research of brain diseases), this thesis studies the basic theoretical methods in three aspects: feature construction, synchronization measurement and disease identification. This thesis developed a deep factorization model to factorize MPBID 
with the intrinsic non-linearity properly fitted and the stability of the solution ensured. A topology-sensitive approach to inter-region correlation analysis based on Heterogeneous Matrix Similarity Measurement was proposed to avoid information loss and distortion of the similarity results. Furthermore, a Grouping Bayesian Optimization method was designed to optimize the hyper-parameter settings. All the novel algorithms proposed in this thesis enhance the stability of identification on MPBID, which preliminarily lays the foundation for the future application of MPBID. |
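
The tensor matricization (mode-n unfolding) step described for D-PARAFAC can be illustrated with a minimal sketch, assuming the tensor is stored as regular nested Python lists. The helper names `shape_of`, `element_at` and `unfold` are hypothetical and are not taken from the thesis. The column ordering used here (remaining modes enumerated with the last one varying fastest) is only one of several conventions in use, and the thesis may adopt a different one.

```python
from itertools import product


def shape_of(tensor):
    """Return the shape of a regular nested-list tensor, e.g. (2, 3, 4)."""
    shape = []
    level = tensor
    while isinstance(level, list):
        shape.append(len(level))
        level = level[0]
    return tuple(shape)


def element_at(tensor, index):
    """Look up one entry by a full multi-index such as (i, j, k)."""
    for i in index:
        tensor = tensor[i]
    return tensor


def unfold(tensor, mode):
    """Mode-`mode` unfolding: rows follow the chosen mode,
    columns enumerate every combination of the other modes."""
    shape = shape_of(tensor)
    other_modes = [k for k in range(len(shape)) if k != mode]
    rows = []
    for i in range(shape[mode]):
        row = []
        # product() varies the last remaining mode fastest (an assumed convention).
        for rest in product(*(range(shape[k]) for k in other_modes)):
            index = list(rest)
            index.insert(mode, i)  # put the fixed mode index back in place
            row.append(element_at(tensor, index))
        rows.append(row)
    return rows


# Toy 2 x 3 x 4 tensor whose entry at (i, j, k) is 100*i + 10*j + k.
toy = [[[100 * i + 10 * j + k for k in range(4)] for j in range(3)] for i in range(2)]
unfoldings = [unfold(toy, mode) for mode in range(3)]
for mode, matrix in enumerate(unfoldings):
    print(mode, len(matrix), "x", len(matrix[0]))
```

For the toy tensor, the three unfoldings have sizes 2 x 12, 3 x 8 and 4 x 6, one matrix per mode. In the scheme the abstract describes, each such matrix would be handed to its own CNN; that part is not sketched here.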