Research On Feature Construction Algorithm For Lifespan Estimation Problem Based On Autoencoder And Transformer

Posted on:2024-05-07

Degree:Master

Type:Thesis

Country:China

Candidate:Z Wang

GTID:2544307064997179

Subject:Engineering

Abstract/Summary:

Currently, when cancer patients and clinicians make treatment decisions together, the expected length of survival is a central consideration. Most existing studies investigate the risk of survival or recurrence after a fixed period (e.g., 1 year or 5 years), but do not estimate a patient’s specific survival time. With the rapid development of modern high-throughput technologies, biological omics data are increasingly being released publicly and applied to a variety of diseases, including cancer. Many studies have used DNA methylation datasets to find clinical associations between DNA methylation biomarkers and tumours. However, these datasets suffer from the "large p, small n" problem, in which the number of features is much larger than the number of samples, and this hinders model training and subsequent analysis.

To predict the specific survival time of patients even though the feature dimension far exceeds the sample size, this thesis proposes SLOGAN, a model based on an autoencoder and a Transformer that combines feature selection with feature construction and predicts survival time from the final selected feature subset. Feature selection is used to reduce redundant features and data noise, lowering computational overhead while improving prediction accuracy. However, because feature selection cannot generate new features, it cannot improve the quality or information content of the features themselves. This work therefore introduces a feature construction method based on an autoencoder and a Transformer that maps the original features to a new space, improving the prediction performance of the model on the feature subset. Feature construction also draws on adversarial learning: an ordinary autoencoder serves as the generator, a Transformer mechanism is added to the discriminator, and a sparse loss function is proposed to assist training and raise the quality of the constructed features. During model construction, the input and output of the generator are set against each other adversarially, so that the generator learns the information in the original features better and the constructed features perform well in prediction.

The experiments use 10 datasets from the TCGA database and comprise six parts: selecting the number of construction cycles, selecting the number of hidden-layer nodes, verifying the necessity of feature construction, comparing feature selection methods, comparing feature construction network models, and an ablation study of the feature construction network. Comparing prediction performance across different numbers of construction cycles shows that more cycles do not necessarily yield better constructed features. To choose an appropriate number of intermediate hidden-layer nodes, the construction results of the SLOGAN algorithm are compared under different node counts. The necessity of feature construction is verified by comparing the regression performance of feature subsets obtained with feature selection alone against those obtained with feature selection followed by feature construction. Comparing the SLOGAN feature selection method with existing feature selection algorithms, and the SLOGAN feature construction model with existing neural network models, shows that the features generated by SLOGAN are superior. To verify the effectiveness of each part of the feature construction network, ablation experiments compare the regression performance of features constructed after removing different parts of the network.

The experimental results show that the new features constructed by the SLOGAN algorithm achieve better performance on the regression prediction problem. They also demonstrate the necessity of feature construction and, through the ablation experiments, the indispensability of each part of the feature construction model.
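The feature construction idea can be illustrated with a minimal sketch. It is not the SLOGAN implementation: the abstract does not define the sparse loss, so an L1 penalty on the hidden code is assumed here, and the Transformer discriminator and adversarial training are omitted entirely. The function names, hyperparameters, and synthetic "large p, small n" data are all hypothetical, and the step-halving optimizer is chosen only to keep the example dependency-light.

```python
import numpy as np


def forward(params, X, lam):
    """Return the hidden code H, the reconstruction X_hat, and the total loss."""
    W1, b1, W2, b2 = params
    H = np.tanh(X @ W1 + b1)      # hidden code = constructed features
    X_hat = H @ W2 + b2           # reconstruction of the input
    # Reconstruction error plus an ASSUMED L1 sparsity penalty on the code.
    loss = np.mean((X_hat - X) ** 2) + lam * np.mean(np.abs(H))
    return H, X_hat, float(loss)


def gradients(params, X, H, X_hat, lam):
    """Analytic gradients of the loss, in the order [W1, b1, W2, b2]."""
    W1, b1, W2, b2 = params
    d_out = 2.0 * (X_hat - X) / X.size
    dH = d_out @ W2.T + lam * np.sign(H) / H.size
    dZ = dH * (1.0 - H ** 2)      # derivative of tanh
    return [X.T @ dZ, dZ.sum(axis=0), H.T @ d_out, d_out.sum(axis=0)]


def construct_features(X, n_hidden=4, lam=1e-3, lr=1.0, steps=300, seed=0):
    """Train a one-hidden-layer autoencoder and return (features, loss history)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    params = [rng.normal(0, 0.1, (p, n_hidden)), np.zeros(n_hidden),
              rng.normal(0, 0.1, (n_hidden, p)), np.zeros(p)]
    H, X_hat, loss = forward(params, X, lam)
    history = [loss]
    for _ in range(steps):
        grads = gradients(params, X, H, X_hat, lam)
        trial = [w - lr * g for w, g in zip(params, grads)]
        H_new, X_hat_new, new_loss = forward(trial, X, lam)
        if new_loss < loss:       # accept the step only if the loss drops
            params, H, X_hat, loss = trial, H_new, X_hat_new, new_loss
            history.append(loss)
        else:                     # otherwise reject it and halve the step size
            lr *= 0.5
    return H, history


# Synthetic stand-in for a methylation matrix: 12 samples, 200 features.
rng = np.random.default_rng(42)
Z = rng.normal(size=(12, 3))
X = Z @ rng.normal(size=(3, 200)) + 0.1 * rng.normal(size=(12, 200))
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each feature

features, history = construct_features(X)
print(features.shape)                       # (12, 4)
print(round(history[0], 4), round(history[-1], 4))
```

In this sketch the 200 original features are mapped to 4 constructed features per sample, which could then be passed to any regression model for survival time prediction. Comparing that regression against one trained on the selected original features alone would mirror the necessity experiment described above.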

Keywords/Search Tags:

DNA methylation, Feature selection, Feature construction, Autoencoder, Generative adversarial networks

Related items

1	Research On Method Of 3D Segmentation Of Liver Based On Conditional Generative Adversarial Neural Networks Using Feature Reduction
2	Research On Diagnosis Method Of Pulmonary Nodules Based Generative Adversarial Networks
3	A Research On Segmentation Of Prostate MRI Image Based On Generative Adversarial Networks
4	Research On Low-dose CT Image Noise Reduction Method Based On Improved Generative Adversarial Network
5	A Study Of Gene Selection Method Based On Generative Adversarial Networks And Swarm Intelligence Optimization
6	Research On Cancer Survival Prognosis Prediction Model Based On Deep Learning
7	Research On Denoising Of ECG Signal Based On Generative Adversarial Networks
8	Research On Low-dose CT Denoising Method Based On Generative Adversarial Network
9	Study On The Response Of Drugs Based On Generative Adversarial Networks
10	Research On Key Technologies Of Human Soft Tissue MR Image Matching