Identification Of Cancer Driver Gene Model Based On Principal Component Analysis And Neural Network Approaches

Posted on: 2018-07-15
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhou
GTID: 2334330512996715
Subject: Computer Science and Technology
Abstract/Summary:
Cancer is one of the major threats to human life and health. It not only places a heavy psychological and economic burden on individuals and families, but also seriously affects global economic development and social progress. Research on the mechanisms and control of cancer has become a focus of health research institutions around the world. Previous studies concentrated on external causes, and little is known about the underlying carcinogenic mechanisms. With the emergence of high-throughput sequencing technologies, it has become possible to analyze internal factors. By identifying changes in intracellular gene expression during cancer progression, scientists have found that certain genes drive tumor development, and that suppressing these genes or their pathways can halt tumor-related events. These genes, called cancer driver genes, are the most important internal causes of cancer, and targeted therapy directed at them may make cancer treatment more effective.

Currently, cancer driver genes are predicted primarily by analyzing the alignment results of a large number of samples. This biological approach is easy to understand, but it requires aligning a very large number of cancer samples, which is costly in money and material resources. With the rapid development of molecular biology, organizations such as TCGA (The Cancer Genome Atlas) provide researchers with large volumes of up-to-date data. In addition, machine learning and data mining technologies provide strong support for analyzing these data, and driver gene identification has developed alongside these data analysis methods.

In this thesis, we introduce the research background, significance, and existing approaches for studying driver genes, and describe in detail the application of principal component analysis and neural network approaches. Based on these two methods, we propose a systematic biological model to predict driver genes. The model derives a set of predicted genes from genetic data and reduces the systematic errors and faults in the experiment. It can also reduce experimental cost and duration, and it provides a basis for targeted cancer therapy. We selected glioblastoma multiforme as the experimental subject for validation. First, we preprocessed and normalized the data, and then applied principal component analysis to the experimental sample data to filter out non-expressed or low-expressed genes. Second, inspired by the module network, we generated blocks by grouping genes with similar mutation rates into the same block, and then ranked the blocks. Finally, a Restricted Boltzmann Machine was used to construct the expression network of cancer-related genes and obtain the predicted set of driver genes, which was then compared with text mining results. About 80% of the genes we predicted also appear in the text mining results, which supports, to some degree, the effectiveness and feasibility of the proposed model.
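
The first two steps of the pipeline described above (filtering genes with principal component analysis, then grouping genes into ranked blocks by mutation rate) might look roughly like the following. This is a minimal sketch under stated assumptions, not the thesis's implementation: the abstract does not give the filtering criterion or the block construction rule, so the function names (`pca_gene_scores`, `filter_genes`, `make_blocks`), the parameters (`keep_fraction`, `n_components`, `n_blocks`), the use of a gene's contribution to the leading principal components as a stand-in for "non-expressed or low-expressed", and the equal-size blocks are all hypothetical. The Restricted Boltzmann Machine step is omitted.

```python
import numpy as np


def pca_gene_scores(expr, n_components=2):
    """Score each gene by its variance-weighted loading on the top components.

    expr: array of shape (n_samples, n_genes), already normalized.
    ASSUMPTION: this scoring rule is illustrative, not taken from the thesis.
    """
    centered = expr - expr.mean(axis=0)
    # SVD of the centered matrix; rows of vt are principal axes over genes.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    k = min(n_components, len(s))
    # Weight each squared loading by the squared singular value of its component.
    return (s[:k, None] ** 2 * vt[:k] ** 2).sum(axis=0)


def filter_genes(expr, keep_fraction=0.5, n_components=2):
    """Return sorted indices of the genes with the highest PCA scores.

    ASSUMPTION: keeping a fixed fraction is a hypothetical cutoff.
    """
    scores = pca_gene_scores(expr, n_components)
    n_keep = max(1, int(round(keep_fraction * expr.shape[1])))
    top = np.argsort(scores)[::-1][:n_keep]
    return np.sort(top)


def make_blocks(mutation_rates, n_blocks=3):
    """Group genes with similar mutation rates into blocks.

    Genes are sorted by mutation rate (highest first) and cut into contiguous,
    near-equal-size blocks, so the returned list is already ranked from the
    highest-rate block to the lowest.
    ASSUMPTION: equal-size blocks are a simplification for illustration.
    """
    order = np.argsort(np.asarray(mutation_rates))[::-1]
    return [block for block in np.array_split(order, n_blocks) if len(block)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: 20 samples, 3 varying genes followed by 3 silent genes.
    expr = np.hstack([rng.normal(size=(20, 3)), np.zeros((20, 3))])
    kept = filter_genes(expr, keep_fraction=0.5)
    print("genes kept after PCA filter:", kept.tolist())

    rates = [0.10, 0.50, 0.40, 0.05, 0.45, 0.12]  # made-up mutation rates
    for rank, block in enumerate(make_blocks(rates, n_blocks=2), start=1):
        print("block", rank, "->", block.tolist())
```

In a fuller version, the ranked blocks would feed the Restricted Boltzmann Machine stage. Note that this score measures how much a gene varies across samples, not its mean expression level, so a criterion that thresholds expression directly may be closer to what the abstract describes.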
Keywords/Search Tags: principal component analysis, neural network, gene expression profile, module network, driver gene