
Diagnosis Of Multi-cancer And Selection Of Key Genes Based On Multinomial Regression

Posted on: 2020-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Wang
Full Text: PDF
GTID: 2404330578966216
Subject: Operational Research and Cybernetics
Abstract/Summary:
Multi-classification based on microarray gene expression data has attracted much attention in recent years. It usually faces two challenges: class imbalance and gene selection, especially the selection of overlapped genes. In this paper, two kinds of regularized multinomial regression are proposed and the corresponding regularization path-solving algorithms are developed. Furthermore, experiments are carried out on two kinds of microarray gene expression data. The main research contents are as follows:

(1) The traditional binary classification of acute leukemia can be further treated as a three-class classification problem. However, the selection of cancer-causing genes, especially overlapped genes, is a challenging issue in multi-classification. An overlapped grouping strategy based on weighted gene co-expression networks is proposed, and a novel regularized multinomial regression with overlapping group lasso penalty (MROGL) is presented, which performs multi-classification and grouped gene selection simultaneously. The proposed method can effectively identify the grouped genes that work synergistically with others in three-class acute leukemia. Note that each overlapped gene is treated as a new gene under the overlapped grouping strategy. Hence the overlapped genes between groups are highlighted. Furthermore, MROGL outperforms five other methods in multi-classification accuracy.

(2) More attention should be paid to improving the classification accuracy of minority classes when performing multi-classification of imbalanced data. The overlapped grouping strategy is enriched so that it explores the group structure of each class and puts the groups of all classes together on an equal footing, thus highlighting the significance of gene groups from the minority classes. Furthermore, data-driven weights are constructed according to information theory. On the basis of the grouping strategy and the data-driven weights, a regularized adaptive multinomial regression with sparse overlapping group lasso penalty (AMRSOGL) is proposed and a regularization path-solving algorithm is developed. The experimental results show that the proposed method can effectively improve the classification accuracy of minority classes while maintaining the overall classification accuracy. Furthermore, AMRSOGL can not only select the key gene groups for each class in multi-classification, but also adaptively select key genes and overlapped genes within each group. Owing to the introduction of network analysis and information theory, the process of multi-classification and gene selection has clear biological significance.
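
The idea of treating each overlapped gene as a new one can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The function names, the `sqrt(group size)` scaling, the mixing parameter `alpha`, and the optional `weights` argument (a stand-in for the data-driven weights, whose exact form the abstract does not give) are all assumptions made for illustration. The sketch duplicates the columns of genes that belong to several groups, so the groups become disjoint in the expanded matrix, and then evaluates a sparse group lasso penalty on a multinomial coefficient matrix.

```python
import numpy as np


def expand_overlapping_groups(X, groups):
    """Copy the columns of X so that every group owns its own genes.

    X      : (n_samples, n_genes) expression matrix.
    groups : list of lists of gene indices; groups may overlap.
    Returns the expanded matrix and, for each group, its (disjoint)
    column indices in the expanded matrix.
    """
    cols, new_groups, start = [], [], 0
    for g in groups:
        cols.append(X[:, g])  # an overlapped gene is copied once per group
        new_groups.append(np.arange(start, start + len(g)))
        start += len(g)
    return np.hstack(cols), new_groups


def sparse_group_lasso_penalty(B, new_groups, lam, alpha, weights=None):
    """Evaluate lam * (alpha * L1 term + (1 - alpha) * group term).

    B       : (n_expanded_genes, n_classes) coefficient matrix.
    weights : hypothetical per-group weights; all ones if omitted.
    The sqrt(group size) factor is a common convention, assumed here.
    """
    if weights is None:
        weights = np.ones(len(new_groups))
    l1_term = np.abs(B).sum()
    group_term = sum(
        w * np.sqrt(len(idx)) * np.linalg.norm(B[idx, :])  # Frobenius norm
        for w, idx in zip(weights, new_groups)
    )
    return lam * (alpha * l1_term + (1.0 - alpha) * group_term)


# Toy data: 3 samples, 4 genes; gene 1 belongs to both groups.
X = np.arange(12, dtype=float).reshape(3, 4)
X_exp, new_groups = expand_overlapping_groups(X, [[0, 1], [1, 2, 3]])
print(X_exp.shape)  # (3, 5): gene 1 now appears twice
```

Because the expanded groups are disjoint, an ordinary (non-overlapping) group lasso solver could in principle be applied to `X_exp`. A gene is then retained if any of its copies receives a nonzero coefficient.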
Keywords/Search Tags:Overlapping group lasso, weighted gene co-expression networks, multi-classification, imbalanced data, key gene selection