Weighted voting ensembles for high dimensional data

Posted on: 2016-10-29
Degree: M.S.
Type: Thesis
University: California State University, Long Beach
Candidate: Hordoan, Liliana A
Full Text: PDF
GTID: 2476390017983243
Subject: Statistics
Abstract/Summary:
In recent years, the understanding and development of microarray data have grown rapidly, to the benefit of medical science. One area of this work applies statistical algorithms to classify diseases, treatments, cancers, and outcomes, especially for high-dimensional data. This thesis investigates two decision voting schemes for classifying outcomes from microarray data: a weighted adjusted voting scheme is compared with the standard majority voting scheme across different types of ensemble models. The investigation starts with decision trees as base classifiers and then refines the ensemble structure to examine how the weighted adjusted voting scheme performs on actual microarray data. Because the data are high-dimensional, cross-validation is used to assess the validity of the statistical analysis. Variable importance is used to improve model efficiency by selecting top-ranked genes via Random Forest. Accuracy is then assessed for the different ensemble methods to draw conclusions about the performance of the weighted voting scheme compared with majority voting.
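The difference between the two voting schemes can be illustrated with a minimal sketch. The abstract does not specify how the weighted adjusted scheme computes its weights, so the example below assumes a generic accuracy-weighted vote in which each base classifier's vote counts in proportion to a hypothetical cross-validated accuracy. The function names, labels, and numbers are illustrative assumptions, not the thesis's method or data.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: one predicted label per base classifier
    weights: one non-negative weight per base classifier
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # Ties are broken by the label's sort order so the result is deterministic.
    return max(sorted(totals), key=totals.__getitem__)


def majority_vote(predictions):
    """Majority voting is the special case where every vote has weight 1."""
    return weighted_vote(predictions, [1.0] * len(predictions))


# Hypothetical votes from five decision trees on a single sample.
votes = ["tumor", "normal", "normal", "tumor", "normal"]
# Hypothetical weights: assumed cross-validated accuracy of each tree.
cv_accuracy = [0.95, 0.55, 0.60, 0.90, 0.52]

majority_label = majority_vote(votes)
weighted_label = weighted_vote(votes, cv_accuracy)
print(majority_label, weighted_label)
```

In this made-up case the two schemes disagree: three weak trees outvote two strong ones under majority voting, while the weighted vote follows the two more accurate trees.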
Keywords/Search Tags: Voting, Data, Weighted, Ensemble