Cancer Classification Research Based On Gene Expression Profile

Posted on: 2015-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
Full Text: PDF
GTID: 2404330488999814
Subject: Software engineering
Abstract/Summary:
Cancer is a complex disease that seriously threatens human health. There are many kinds of cancer, and none is easy to cure. Early diagnosis can help save patients' lives to a certain degree. However, traditional cancer diagnosis methods have their own limitations and cannot solve this problem effectively. The gene chip, also known as the DNA microarray, has been a major technical breakthrough in bioinformatics since the last century, making it possible to measure the expression of thousands of genes simultaneously in a single experiment. Since then, the gene chip has been used as a basic tool in cancer classification research. Using gene expression profiles, the cancer classification problem can be analyzed and explained at the molecular level.

Current research on cancer classification based on gene expression focuses mainly on two aspects. On one hand, gene expression data are high-dimensional, have small sample sizes, and contain high redundancy and noise, so selecting the most relevant genes from this large volume of data becomes a key task. On the other hand, morphology-based methods are still the primary means of clinical cancer diagnosis, so finding an effective and stable classification algorithm is another important task. With these two aspects in mind, this thesis reviews the related literature, covering the general procedure of cancer classification and the corresponding classification methods and feature gene selection methods. The following work was then carried out:

1. The concept of ensemble learning is summarized, and a data-based ensemble classification method, single point ensemble classification (SPEC), is proposed. The method applies a novel strategy: it extracts the base classifiers and the ensemble rules at the same time as it selects feature genes. Experiments on the colon cancer dataset and the acute leukemia dataset gave satisfactory results with very low time and space consumption, which demonstrates the effectiveness of SPEC.

2. The shortcomings and limitations of SPEC are pointed out, and ideas for improvement are put forward. To improve on SPEC, an interval-based ensemble classification method is proposed. Performance tests on the colon cancer dataset and the acute leukemia dataset yielded better classification accuracy than SPEC and other commonly used methods. In addition, the method was tested on a multi-label dataset, the Mixed Lineage Leukemia dataset, and achieved good results.
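
The abstract does not spell out how SPEC builds its base classifiers, so the following Python sketch is only an illustration of the general idea of selecting feature genes and building an ensemble in one pass. It assumes, hypothetically, that each base classifier is a one-gene threshold rule, that genes are ranked by training accuracy, and that the ensemble rule is a majority vote. The names `fit_stump`, `fit_ensemble`, and `predict`, and the toy data, are invented for illustration and are not taken from the thesis.

```python
from typing import List, Tuple

# (gene index, threshold, label predicted when expression > threshold)
Rule = Tuple[int, float, int]


def fit_stump(values: List[float], labels: List[int]) -> Tuple[float, int, float]:
    """Find the best single-threshold rule for one gene (labels are 0 or 1).

    Returns (threshold, label_above, training_accuracy).
    """
    distinct = sorted(set(values))
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])] or [distinct[0]]
    best = None
    for thr in candidates:
        for label_above in (0, 1):
            hits = sum(
                (label_above if v > thr else 1 - label_above) == y
                for v, y in zip(values, labels)
            )
            acc = hits / len(labels)
            if best is None or acc > best[2]:
                best = (thr, label_above, acc)
    return best


def fit_ensemble(X: List[List[float]], y: List[int], k: int = 3) -> List[Rule]:
    """Score every gene with its own rule and keep the k most accurate.

    Gene selection and base-classifier construction happen in the same pass
    (an assumption made for this sketch, not a description of SPEC).
    """
    scored = []
    for g in range(len(X[0])):
        thr, label_above, acc = fit_stump([row[g] for row in X], y)
        scored.append((acc, g, thr, label_above))
    # Highest accuracy first; ties broken by gene index for determinism.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(g, thr, label_above) for _, g, thr, label_above in scored[:k]]


def predict(ensemble: List[Rule], sample: List[float]) -> int:
    """Majority vote over the one-gene rules."""
    votes = sum(
        (label_above if sample[g] > thr else 1 - label_above)
        for g, thr, label_above in ensemble
    )
    return 1 if 2 * votes > len(ensemble) else 0


# Toy data (fabricated for illustration): 6 samples x 4 genes, labels 0/1.
X = [
    [1.0, 9.0, 5.0, 2.0],
    [1.5, 8.0, 3.0, 2.5],
    [2.0, 8.5, 6.0, 1.0],
    [7.0, 2.0, 5.5, 8.0],
    [8.0, 1.0, 3.5, 9.0],
    [9.0, 1.5, 4.0, 7.5],
]
y = [0, 0, 0, 1, 1, 1]

ensemble = fit_ensemble(X, y, k=3)
print([g for g, _, _ in ensemble])               # indices of the selected genes
print(predict(ensemble, [1.2, 8.8, 4.0, 1.5]))   # vote for a new sample
```

With real microarray data, the rules would be fitted on a training split and evaluated on held-out samples or by cross-validation, since training accuracy on a handful of samples is an optimistic estimate.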
Keywords/Search Tags:Gene chip, gene expression profile, cancer classification, feature gene selection, ensemble learning