Gene Selection About Plant Stress Response Based On Neighborhood System And Rough Set Theory

Posted on: 2015-10-22
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhang
Full Text: PDF
GTID: 2180330467485812
Subject: Computer software and theory
Abstract/Summary:
During growth, plants are often subjected to various stresses, such as adverse environmental conditions, diseases, and insect pests. Identifying the important genes and studying the mechanism of plant stress response are meaningful for agriculture, forestry, environmental protection, and related fields. Advances in gene expression data acquisition technology make such research possible, but they also bring new challenges for data processing and analysis. Because gene expression data are high-dimensional, have few samples, and are highly redundant, building a rough set model with good data processing capability and designing gene selection methods are active research topics in bioinformatics and in applications of rough set theory.

To improve the data processing capability of neighborhood-based rough set models for gene expression data, two kinds of neighborhoods that can deal with numerical data directly are discussed: the δ-neighborhood and the intersection neighborhood. A significant gene selection algorithm based on the positive region and gene ranking is then proposed. Comparative experiments are designed to analyze the performance of the two kinds of neighborhoods and of two kinds of approximation operators, one based on elements and one based on elementary sets. Experimental results on four plant stress data sets show that the proposed algorithm can select significant genes closely related to plant stress. The comparative results also indicate that the approximation operator based on elementary sets performs better, and that each kind of neighborhood suits different data sets, although the definition of the intersection neighborhood is more flexible.

To further show the advantage of the intersection neighborhood and to optimize the neighborhood thresholds, a multiobjective optimization method is introduced, because multiple criteria need to be considered in gene selection. A significant gene selection algorithm with threshold optimization is proposed, which selects a significant gene subset during threshold optimization. Experimental results confirm that the proposed algorithm can improve the classification accuracy of the selected gene subset or decrease the number of genes to a certain extent, and that setting different thresholds for different genes in the intersection neighborhood enhances adaptability.

Because gene selection methods based only on gene expression data are limited in the interpretability of their results, gene ontology knowledge is introduced and a novel knowledge representation model based on neighborhood system theory is constructed. This model can represent the information from the two kinds of data sources at the same time. On the basis of the constructed neighborhood system, a new framework and method for gene selection using a neighborhood-system-based rough set model are proposed. Experimental results on two Arabidopsis thaliana data sets show that the proposed method can select a significant gene subset with high classification accuracy and good biological interpretability.
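
The idea of ranking genes by a neighborhood-based positive region can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes the classic δ-neighborhood with Euclidean distance, scores each gene on its own by the fraction of samples whose whole neighborhood shares their class label, and does not model the intersection neighborhood or threshold optimization. The function names, the toy data, and the threshold value 0.2 are all hypothetical.

```python
import numpy as np

def delta_neighborhood(X, i, genes, delta):
    """Indices of samples within Euclidean distance delta of sample i,
    measured only on the given gene columns (assumed definition)."""
    diff = X[:, genes] - X[i, genes]
    dist = np.sqrt((diff ** 2).sum(axis=1))
    return np.flatnonzero(dist <= delta)

def positive_region(X, y, genes, delta):
    """Samples whose entire delta-neighborhood carries the same class label."""
    return [i for i in range(X.shape[0])
            if np.all(y[delta_neighborhood(X, i, genes, delta)] == y[i])]

def rank_genes(X, y, delta):
    """Rank genes by the relative size of the positive region each gene
    produces on its own (a simple, assumed ranking criterion)."""
    n_samples, n_genes = X.shape
    scores = np.array([len(positive_region(X, y, [g], delta)) / n_samples
                       for g in range(n_genes)])
    order = np.argsort(-scores, kind="stable")  # best gene first
    return order, scores

# Hypothetical toy data: 4 samples (rows), 2 genes (columns), 2 classes.
X = np.array([[0.10, 0.50],
              [0.15, 0.90],
              [0.90, 0.55],
              [0.95, 0.10]])
y = np.array([0, 0, 1, 1])

order, scores = rank_genes(X, y, delta=0.2)
print(order, scores)
```

In this toy example the first gene separates the two classes cleanly, so every sample falls in its positive region, while the second gene places samples of different classes in the same neighborhood and scores lower.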
Keywords/Search Tags: Neighborhood System, Rough Set, Plant Stress Response, Gene Selection