
Design And Implementation Of Protein Thermostability Classification System

Posted on: 2017-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: J K Zhang
Full Text: PDF
GTID: 2271330482997502
Subject: Computer technology
Abstract/Summary:
As the main components of life activities and as natural biological catalysts, proteins have great application potential and good development prospects in industrial production. However, most proteins are mesophilic. Their poor thermostability greatly limits their industrial application, because industrial production processes often run at high temperatures. Exploring the mechanism of protein thermostability through pattern recognition methods, and looking for ways to improve the thermostability of a protein, has been an important research direction in computational biology and protein engineering. Developing a system that classifies protein thermostability effectively will help researchers explore this mechanism.

The main function of the system is to classify protein thermostability and, through the classification model, to find the features that contribute to it. The system provides a large number of protein sequences as the training set, and offers users feature calculation, feature selection, classification model building, protein thermostability classification, results analysis and data file export functions. The results can provide theoretical support for experiments that modify protein structure to improve thermostability.

The system was developed in Java on the MyEclipse platform, uses the Spring MVC architecture, and uses MySQL as its database. It consists mainly of data preparation, data classification, results analysis and system management modules. In the data preparation module, the system calculates 430-dimensional features from the protein sequence and reduces the feature dimension through Information Gain, Information Gain Ratio and Relief. In the data classification module, the system builds a combined classifier through AdaBoost, with Support Vector Machine as the base classifier.

Testing shows that the system can classify protein thermostability effectively and find features that contribute to protein thermostability. It meets the functional and performance requirements set for its development.
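
The Information Gain criterion named in the abstract can be illustrated with a minimal Java sketch. It is not the thesis's implementation: the abstract does not say how continuous sequence features are discretized, so a single binary threshold split is assumed here, and the class name `InfoGainSketch`, the method names and the toy data are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of information-gain scoring for one feature.
// Assumption: the feature is discretized by a single threshold (<= vs >).
class InfoGainSketch {

    // Shannon entropy, in bits, of an array of class labels.
    static double entropy(int[] labels) {
        if (labels.length == 0) {
            return 0.0;
        }
        Map<Integer, Integer> counts = new HashMap<>();
        for (int y : labels) {
            counts.merge(y, 1, Integer::sum);
        }
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / labels.length;
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // Information gain of the split "feature <= threshold":
    // H(labels) minus the size-weighted entropy of the two partitions.
    static double infoGain(double[] feature, int[] labels, double threshold) {
        int n = labels.length;
        int nLow = 0;
        for (double v : feature) {
            if (v <= threshold) {
                nLow++;
            }
        }
        int[] low = new int[nLow];
        int[] high = new int[n - nLow];
        int i = 0;
        int j = 0;
        for (int k = 0; k < n; k++) {
            if (feature[k] <= threshold) {
                low[i++] = labels[k];
            } else {
                high[j++] = labels[k];
            }
        }
        double conditional = ((double) nLow / n) * entropy(low)
                + ((double) (n - nLow) / n) * entropy(high);
        return entropy(labels) - conditional;
    }

    public static void main(String[] args) {
        // Toy data: label 1 = thermophilic, 0 = mesophilic (made-up values).
        int[] labels = {0, 0, 1, 1};
        double[] separating = {0.1, 0.2, 0.8, 0.9};
        double[] uninformative = {0.1, 0.9, 0.2, 0.8};
        System.out.println(infoGain(separating, labels, 0.5));
        System.out.println(infoGain(uninformative, labels, 0.5));
    }
}
```

A feature whose threshold split separates the two classes cleanly scores close to the full label entropy, while a feature whose split leaves both partitions mixed scores close to zero. Ranking the 430 features by such a score and keeping the top ones is one plausible way to reduce the dimension.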
Keywords/Search Tags: protein thermostability, data classification, feature selection, Support Vector Machine, AdaBoost