New methods for classification, prediction, and feature weighting in bioinformatics

Posted on: 2016-07-08
Degree: M.S.
Type: Thesis
University: University of Houston-Clear Lake
Candidate: Nguyen, Duong B
GTID: 2470390017477449
Subject: Computer Science
Abstract/Summary:
Machine learning is a heavily researched topic in computer science and is used extensively in bioinformatics. Classification and prediction are two fundamental machine learning tasks that can be used to analyze data, extract models, and predict future data trends. This research presents new techniques for learning and prediction in bioinformatics. Specifically, new classification, prediction, and feature weighting techniques are proposed, evaluated within the machine learning domain, and applied to two bioinformatics tasks: protein subcellular localization and biomedical document classification. Protein subcellular localization prediction is an important task with significant applications, such as discovering new protein-disease associations and understanding the biological processes and molecular functions in which proteins are involved. The proposed techniques include a new parametric inductive classifier applied to protein subcellular localization, using improved features extracted from protein sequences. The method is effective at inducing features from the sequences of proteins that occur in multiple localizations. The methods were implemented and evaluated on both the protein subcellular localization prediction task and the biomedical document classification task, and were integrated into Weka, a well-known open-source machine learning platform, so that other researchers can use them. The evaluation results are encouraging and motivate further research in this direction. The methods are expected to contribute to the many research projects that rely extensively on protein and gene sequences for various applications.
Keywords/Search Tags: Prediction, Classification, Bioinformatics, Protein, New, Machine learning, Methods