Research On Novel Feature Gene Selection With Gene Expression Data

Posted on: 2019-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: S H Liu
GTID: 2404330542996772
Subject: Operational Research and Cybernetics
Abstract/Summary:
Since its discovery, cancer has been one of the most difficult diseases for humanity to overcome. Because of its complex and ever-changing characteristics, it remains a major challenge in clinical medicine. Essentially, cancer is a genetic disease caused by the differential expression of genes within cells. Gene chip (genomics) technology is one of the most important breakthroughs in bioinformatics in this century. It can obtain the gene expression data of experimental samples simultaneously, and has therefore opened up a new path for the treatment of cancer. Omics technology can measure a large amount of gene expression data at once, and these data contain basic information about various biological processes as well as the current physiological state of the cells or tissues. However, the gene chip data obtained in this way contain a large number of irrelevant genes, while cancer is often caused by only a small subset of genes. Extracting the relevant disease-causing genes from these data and eliminating the irrelevant ones is therefore crucial to research in this area. In view of this, this paper focuses on feature gene selection algorithms, with the aim of selecting the relevant pathogenic genes from gene chip data. The main research contents are as follows:

1. Based on the ideas of the SCAD algorithm and the KBCGS algorithm, this paper proposes an improved algorithm that greatly improves the computation time and classification accuracy of KBCGS. In the KBCGS algorithm, the core gene-screening step uses a Gaussian kernel; in this paper, a double Gaussian kernel function is used instead. By mixing two kernel functions, the algorithm takes into account both the global and the local characteristics of the data.

2. After the relevant feature genes are obtained, several common classifiers, such as the Support Vector Machine (SVM) and K Nearest Neighbor (KNN), are applied to multiple classic gene chip datasets, and the two gene selection methods proposed in this paper are analyzed and compared with other popular feature gene selection methods. The results show that the proposed gene selection method achieves better results on these datasets, thus verifying its feasibility.
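
The abstract does not state how the two Gaussian kernels are mixed, so the following is only a minimal sketch of one common way to do it: a convex combination of a narrow-bandwidth kernel (sensitive to local structure) and a wide-bandwidth kernel (sensitive to global structure). The function names, the bandwidths `sigma_local` and `sigma_global`, and the mixing `weight` are all illustrative assumptions and are not taken from the thesis or from KBCGS.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Standard Gaussian (RBF) kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def double_gaussian_kernel(x, y, sigma_local=0.5, sigma_global=5.0, weight=0.5):
    """Hypothetical mixed kernel: a convex combination of two Gaussian kernels.

    The narrow kernel (sigma_local) responds only to nearby samples, while the
    wide kernel (sigma_global) still responds to distant ones. The mixing form
    and all default values are assumptions made for illustration.
    """
    local_part = gaussian_kernel(x, y, sigma_local)
    global_part = gaussian_kernel(x, y, sigma_global)
    return weight * local_part + (1.0 - weight) * global_part


# Two toy expression profiles at Euclidean distance 5.
sample_a = [0.0, 0.0]
sample_b = [3.0, 4.0]
similarity = double_gaussian_kernel(sample_a, sample_b)
```

For two samples this far apart, the narrow kernel contributes almost nothing and the similarity comes almost entirely from the wide kernel, which is the sense in which such a mixture reflects both local and global characteristics of the data.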
Keywords/Search Tags:gene expression, tumor, kernel method, supervised learning, feature selection