A Classification Model For Gene Expression Data Of Cancer Patients

Posted on: 2024-03-21

Degree: Master

Type: Thesis

Country: China

Candidate: Q Zhang

GTID: 2544307127453834

Subject: Software engineering

Abstract/Summary:

Cancer is a debilitating disease characterized by the uncontrolled growth and division of abnormal cells. Available treatments, such as chemotherapy, radiotherapy, and surgery, can cause significant physical and psychological pain. Mutations in key genes are known to contribute to cancer development, which makes the study of gene expression data important. Advances in sequencing technologies have produced high-quality gene expression data, which is vital for exploring biological, medical, and disease mechanisms. Machine learning algorithms can be used to investigate the relationship between gene expression data and cancer progression, supporting personalized and precise treatment plans for patients. However, analyzing cancer gene expression data is challenging because of small sample sizes, high dimensionality, high noise, and class imbalance. In this study, osteosarcoma and gastric cancer are chosen as representative cancers, and their specific characteristics are combined with the common properties of gene expression datasets to study the following three aspects:

1) This paper proposes a novel classification model for osteosarcoma gene expression data based on Weighted Multi-Source Data Fusion (W-MSDF) and an Excitation-based Convolutional Neural Network combined with a Support Vector Machine (E-CNN-SVM). The data processing stage uses an improved weighting mechanism, inspired by multi-view algorithms, to fuse the feature extraction information from different data sources. This strengthens the intrinsic connections within small-sample data and alleviates the problem of insufficient data volume. The E-CNN-SVM classification algorithm combines convolutional neural networks and support vector machines with an excitation mechanism, inspired by squeeze-and-excitation networks, that increases the weight of core features and improves classification performance on small-sample data. The experimental results demonstrate that the proposed model effectively improves the classification accuracy of osteosarcoma gene expression data.

2) This paper proposes a classification model for gastric cancer DNA methylation data that combines Wavelet Threshold Denoising-Random Forest (WRF), Sample Expanding (SE), and an Excitation-based Stacked Autoencoder (ESAE). In the data processing stage, a denoising autoencoder randomly corrupts some gene fragments to expand the number of training samples and make the model more robust, which helps prevent overfitting. The WRF algorithm is used to improve the feature selection ability of the model. In the classification module, ESAE is used to suppress the contribution of samples with low importance. The experimental results demonstrate that the proposed model effectively improves the classification performance on gastric cancer DNA methylation data and avoids overfitting.

3) Building on the classification models proposed above for osteosarcoma and gastric cancer patients, this paper develops intelligent classification software for use in practical medical scenarios. After a requirements and feasibility analysis, the system is designed and implemented in Python with the PyQt framework. By applying the gene expression data classification models, the software enables medical staff to classify and diagnose patients with gastric cancer and osteosarcoma efficiently, reducing their workload.
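
Two ideas from the abstract can be illustrated with a minimal NumPy sketch: expanding a small training set by randomly corrupting input entries, and reweighting features with a squeeze-and-excitation style gate. This is not the thesis's implementation. The function names, array shapes, drop probability, and the random (untrained) gate weights are all assumptions chosen for illustration, and the data is synthetic.

```python
import numpy as np


def expand_with_masking_noise(x, y, copies, drop_prob, rng):
    """Append `copies` corrupted versions of x to the original samples.

    Each entry of a corrupted copy is set to zero with probability
    `drop_prob`; the labels are repeated unchanged.
    """
    xs, ys = [x], [y]
    for _ in range(copies):
        keep = rng.random(x.shape) >= drop_prob  # True where the value survives
        xs.append(x * keep)
        ys.append(y)
    return np.concatenate(xs, axis=0), np.concatenate(ys, axis=0)


def se_gate(features, w1, w2):
    """Squeeze-and-excitation style reweighting of 1D feature maps.

    features: array of shape (batch, channels, length)
    w1: (channels, hidden) and w2: (hidden, channels) weight matrices
    Returns the reweighted features and the per-channel gates.
    """
    squeezed = features.mean(axis=2)              # squeeze: average over length
    hidden = np.maximum(squeezed @ w1, 0.0)       # bottleneck layer with ReLU
    gates = 1.0 / (1.0 + np.exp(-(hidden @ w2)))  # excitation: sigmoid gates
    return features * gates[:, :, None], gates


rng = np.random.default_rng(0)

# Synthetic stand-in for a small expression matrix: 6 samples, 20 genes.
x = rng.normal(size=(6, 20))
y = np.array([0, 0, 0, 1, 1, 1])
x_aug, y_aug = expand_with_masking_noise(x, y, copies=3, drop_prob=0.2, rng=rng)

# Synthetic stand-in for convolutional outputs: 6 samples, 8 channels, length 5.
feature_maps = rng.normal(size=(6, 8, 5))
w1 = rng.normal(size=(8, 2))  # untrained weights, for illustration only
w2 = rng.normal(size=(2, 8))
reweighted, gates = se_gate(feature_maps, w1, w2)
```

In a trained network the gate weights would be learned together with the rest of the model, so that channels carrying core features receive gates close to 1 and less informative channels are scaled down.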

Keywords/Search Tags:

Small sample, Osteosarcoma, Gastric cancer, Gene expression data, Excitation

Related items

1	Prediction Of Local Recurrence Of Head And Neck Cancer Unimodality Based On Small Sample And High-dimensional Gene Expression Data
2	Research And Application Of Small Sample Data Learning Method In Gastric Cancer Cell Image
3	Research On Low-rank Representation Methods For Cancer Gene Expression Data Mining
4	Research And Application Of Classification Of Small Sample Clinical Data
5	Research On Machine Learning Method Of High Dimensional Small Sample (Medical) Data
6	Research On Analysis Method Of Tumor Gene Expression Data Based On Machine Learning
7	Prediction Of Depression Classification And Biomarker Discovery Based On Small Sample Plasma Mass Spectrometry Data
8	The Research Of Gastric Cancer Feature Genes Selection Based On Gene Expression Data
9	The Effects Of E2F-1 Gene Expression And Celluar Biological Behaviors By Short Interfering RNAs In Gastric Cancer Cell Line MGC803
10	Research On Swarm Intelligence Feature Selection Algorithm For Small Sample(Medical) Data