
Research On Cancer Detection Technology Based On Metabolomics Mass Spectrometry Data

Posted on: 2024-07-20
Degree: Master
Type: Thesis
Country: China
Candidate: X P Liu
GTID: 2544307103474464
Subject: Control Engineering
Abstract/Summary:
Early detection of cancer is an important direction in medical research because it can substantially improve patients' cure rate and quality of life. As an emerging research method, metabolomics can be used to analyze the effects of cancer on cellular metabolic status, explore the metabolic mechanisms of cancer development and progression, and find metabolic differences that support early cancer diagnosis. Mass spectrometry is an important tool in metabolomics research: its high sensitivity and high resolution allow it to detect a large number of metabolites, which makes the discovery of cancer biomarkers possible. However, mass spectrometry data suffer from problems such as missing values and high-dimensional features, which hinder cancer detection and biomarker identification. The increasing maturity of machine learning and neural network algorithms offers new ways to address these problems. On this basis, this thesis proposes a method that combines metabolomics mass spectrometry data with machine learning algorithms, with the research objectives of cancer detection and cancer biomarker discovery, and carries out the following studies:

(1) A sample similarity-based missing value imputation algorithm, SCKRI, is proposed. Missing values are simulated on two published cancer metabolomics datasets, and imputation experiments are performed. SCKRI is compared with Mean Imputation (Mean), k-Nearest Neighbor Imputation (KNNI), random forest imputation (missForest) and Multiple Imputation (MI). The experimental results show that, measured against the original datasets, the data imputed by SCKRI have the smallest normalized root mean square error (NRMSE).

(2) A network model based on a Variational Auto Encoder (VAE) is constructed as a feature extractor and combined with other machine learning classifiers for cancer detection. The classification results of four classifiers, Logistic Regression (LR), k-Nearest Neighbor (KNN), Support Vector Machine (SVM) and Random Forest (RF), are compared with and without the VAE as the feature extractor. The experimental results show that the features extracted by the VAE give better classification results on all four classifiers, which verifies that using a VAE for cancer detection is feasible.

(3) A feature selection algorithm, mVAE-FS, based on the weights of the VAE network is proposed to select a subset of features that discriminate well between cancer samples and healthy control samples. The experimental results show that the feature subsets selected by mVAE-FS achieve classification accuracies of 95.26%, 94.28% and 91.22% on the three datasets, respectively. Biomarker points are then selected from the features obtained by mVAE-FS; the 10 selected biomarker points, evaluated by classification, achieve accuracies of 93.24%, 91.52% and 89.62% on the three datasets.
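
The evaluation protocol described in study (1), in which known values are masked, imputed, and scored by NRMSE against the original data, can be sketched as follows. This is a minimal illustration and not the thesis's code. It uses synthetic data and plain mean imputation as the baseline; SCKRI itself is not reproduced because the abstract does not specify it. The function names (`simulate_missing`, `mean_impute`, `nrmse`), the 20% missing rate, and the choice to normalize the RMSE by the standard deviation of the true masked values are all assumptions made for the example.

```python
import math
import random


def simulate_missing(data, rate, rng):
    """Copy `data`, set roughly `rate` of the entries to None, and return the masked positions."""
    masked = [row[:] for row in data]
    positions = []
    for i, row in enumerate(masked):
        for j in range(len(row)):
            if rng.random() < rate:
                row[j] = None
                positions.append((i, j))
    return masked, positions


def mean_impute(masked):
    """Fill each missing entry with the mean of the observed values in its column."""
    n_cols = len(masked[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in masked if row[j] is not None]
        # Fall back to 0.0 if a column has no observed values at all.
        col_means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[col_means[j] if v is None else v for j, v in enumerate(row)]
            for row in masked]


def nrmse(original, imputed, positions):
    """RMSE over the masked entries divided by the std of their true values.

    The normalization is an assumption; other NRMSE definitions exist.
    """
    if not positions:
        raise ValueError("no masked positions to evaluate")
    truth = [original[i][j] for i, j in positions]
    guess = [imputed[i][j] for i, j in positions]
    mse = sum((t - g) ** 2 for t, g in zip(truth, guess)) / len(truth)
    mean_t = sum(truth) / len(truth)
    var_t = sum((t - mean_t) ** 2 for t in truth) / len(truth)
    if var_t == 0:
        raise ValueError("true values have zero variance; NRMSE is undefined")
    return math.sqrt(mse / var_t)


# Synthetic stand-in for a metabolite intensity matrix (30 samples x 5 features).
rng = random.Random(0)
original = [[rng.gauss(10.0, 2.0) for _ in range(5)] for _ in range(30)]

masked, positions = simulate_missing(original, rate=0.2, rng=rng)
imputed = mean_impute(masked)
score = nrmse(original, imputed, positions)
print(f"Mean imputation NRMSE: {score:.3f}")
```

A lower NRMSE means the imputed values are closer to the held-out originals, so competing imputation methods can be ranked by running each one on the same masked matrix and comparing their scores.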
Keywords/Search Tags:cancer detection, metabolomics, mass spectrometry, feature selection, Variational Auto Encoder