| Metabonomics is the branch of science concerned with the quantitative understanding of the metabolite complement of integrated living systems and its dynamic responses to changes in both endogenous and exogenous factors. NMR-based metabonomics, which uses nuclear magnetic resonance as its main analytical technology, has been widely applied in many fields, including drug research, molecular physiology, molecular pathology, genomics, nutrition, and environmental science. Because metabonomics data are nonlinear, high-dimensional, and drawn from small samples, a key problem is to develop new data analysis methods that suit these characteristics and are reasonably general. This work addresses both data pretreatment and data analysis: it introduces a novel adaptive binning method based on statistical discrepancy and a multivariate statistical analysis method based on non-negative matrix factorization (NMF). The main results are summarized as follows. First, drawing on an extensive literature survey, we give a brief overview of common pattern analysis methods and discuss the likely development trends of metabonomics data analysis. Second, we propose a novel adaptive binning method based on statistical discrepancy for data pretreatment. A function is constructed to describe the statistical discrepancy of metabonomics data, and the data matrices are then integrated over integration intervals designed adaptively according to the statistical discrepancy of the variables. 
Both simulated NMR data and experimental spectra from individuals undergoing dietary intervention were used to validate the performance of the adaptive binning. The results showed that the proposed binning method can effectively improve the accuracy of sample classification and of characteristic biomarker identification. Third, non-negative matrix factorization (NMF) was applied to NMR-based metabonomics pattern analysis. NMF was compared in detail with the most conventional method, principal component analysis (PCA), by using both methods to discriminate the urine and serum spectra of type 2 diabetes patients from those of healthy controls and to identify potential biomarkers. The comparison showed that the particular advantages of NMF, namely its non-negativity constraints and parts-based representation, make it better suited to detecting low-concentration biomarkers. We conclude that the adaptive binning method based on statistical discrepancy and the multivariate statistical analysis method based on NMF are effective tools for pattern analysis and characteristic biomarker identification in metabonomics research. |
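
As an illustration of the NMF step mentioned in the abstract, the following is a minimal sketch of non-negative matrix factorization using the standard multiplicative update rules for the Frobenius-norm objective. It is not the implementation used in this work. The function name `nmf`, the parameters `rank`, `n_iter`, and `eps`, and the synthetic "binned spectra" matrix are all assumptions introduced here for illustration; real input would be a non-negative matrix of binned NMR intensities (samples by bins).

```python
import numpy as np


def nmf(X, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate non-negative X (samples x bins) as W @ H.

    W (samples x rank) holds per-sample weights; H (rank x bins) holds
    non-negative spectral "parts". Uses multiplicative updates, which
    keep both factors non-negative at every iteration.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_bins = X.shape
    W = rng.random((n_samples, rank)) + eps
    H = rng.random((rank, n_bins)) + eps
    for _ in range(n_iter):
        # Update H with W fixed, then W with H fixed.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical stand-in for binned spectra: 12 samples, 50 bins,
# built from 2 non-negative underlying components.
rng = np.random.default_rng(1)
X = rng.random((12, 2)) @ rng.random((2, 50))

W, H = nmf(X, rank=2)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Because every entry of `H` is non-negative, each row can be read as an additive spectral component, which is the parts-based property the abstract contrasts with PCA, whose loadings may take either sign.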