Identification Of Secretory Proteins In Mycobacterium Tuberculosis

Posted on:2018-07-02

Degree:Master

Type:Thesis

Country:China

Candidate:H Yang

Full Text:PDF

GTID:2334330512489810

Subject:Biophysics

Abstract/Summary:

Mycobacterium tuberculosis (MTB), also known as the tubercle bacillus, is a slightly curved, rod-shaped aerobic bacterium. Protected by its cell-wall lipids and capsule, it is highly resistant to the external environment, and it is the established causative pathogen of the contagious disease tuberculosis (TB). China is one of the countries bearing the heaviest burden of TB, a disease that kills as many as 1 million people every year. As a chronic respiratory infectious disease, TB has inconspicuous early symptoms and a long treatment cycle, so it spreads easily through the population and is hard to control. Over the past ten decades, although many medical experts have devoted themselves to studying the molecular structure, toxicity and pathology of MTB, no drug has yet been able to prevent or cure the disease completely, owing to the bacterium's complex membrane structure and frequent gene mutations. Recent studies suggest that secretory protein antigens can be used to detect antibodies in infected specimens, so distinguishing secretory proteins from non-secretory proteins is important for tracing the real pathogenic factors and for developing vaccines or drugs against TB.

In this work, we developed an algorithm to recognize the secretory proteins of MTB and made it available as an online service. First, we constructed benchmark datasets of MTB proteins collected from experimentally confirmed records in UniProt. After removing redundant sequences as far as possible with the CD-HIT online service, we obtained a positive dataset of 35 samples and a negative dataset of 266 samples. Next, we extracted g-gapped dipeptide compositions and physicochemical property features to encode each protein sequence as a feature vector. Finally, we built and trained the model with the widely used support vector machine (SVM) algorithm; to further improve its predictive power, we performed feature selection on the basis of the optimal model parameters.

As a result, each sequence was represented by a 374-dimensional feature vector comprising 9-gapped dipeptide compositions and hydrophilic/hydrophobic properties. In the jackknife test, the proposed algorithm achieved an average accuracy of 87.18%, and the area under the receiver operating characteristic curve was 0.93. To illustrate the advantage of the SVM-based model, we rebuilt the model on the same benchmark dataset using Random Forest, Bayes Network and RBF Network, all of which are implemented in the Weka software. The jackknife test again showed that the SVM-based model outperforms the other three in predicting the secretory proteins of MTB. To help researchers in related fields communicate research progress and share results, the user-friendly online service MycoSec (http://lin.uestc.edu.cn/server/MycoSec/) is open and free for non-commercial use.
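The g-gapped dipeptide composition named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definition, in which the frequency of each ordered residue pair separated by g positions is its count divided by L - g - 1 for a sequence of length L. The abstract does not spell out the formula, so this definition, the function name `g_gap_dipeptide_composition` and the toy sequence are illustrative assumptions, not the thesis's actual implementation.

```python
from itertools import product

# The 20 standard amino acids; their ordered pairs give 400 dipeptide features.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def g_gap_dipeptide_composition(sequence, g):
    """Return a 400-dimensional list of g-gapped dipeptide frequencies.

    Assumed definition: residue i is paired with residue i + g + 1, and each
    pair count is divided by the number of such pairs, len(sequence) - g - 1.
    With g = 0 this reduces to the ordinary (adjacent) dipeptide composition.
    """
    seq = sequence.upper()
    n_pairs = len(seq) - g - 1
    if g < 0 or n_pairs <= 0:
        raise ValueError("g must be >= 0 and the sequence longer than g + 1")

    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(n_pairs):
        pair = seq[i] + seq[i + g + 1]
        # Pairs containing non-standard residues (e.g. 'X') are not counted,
        # but they still contribute to the denominator.
        if pair in counts:
            counts[pair] += 1

    return [counts[d] / n_pairs for d in DIPEPTIDES]


# Toy usage: in "ACDACD" with g = 1 the pairs are AD, CA, DC, AD.
features = g_gap_dipeptide_composition("ACDACD", g=1)
print(len(features))                      # 400
print(features[DIPEPTIDES.index("AD")])   # 0.5
```

A vector of this kind, computed with g = 9 and combined with hydrophilic/hydrophobic property features, would then go through feature selection; how the thesis arrives at its final 374 dimensions is not detailed in the abstract.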

Keywords/Search Tags:

mycobacterium tuberculosis, secretory protein, g-gapped dipeptide pseudo amino acid composition, support vector machine, analysis of variance

Related items

1	Construction Of A Eukaryotic Expression Vector Of Mycobacterium Tuberculosis 38 KD And ESAT-6 Protein And Its Secretory Expression In HEK 293T Cells
2	The Inhibitory Effects Of The Virulence Factor Secretory Acid Phosphatase Of Mycobacterium Tuberculosis (SapM) On The Autophagy Of Murine Macrophages
3	The Study Of Mycobacterium Tuberculosis Secreted Protein Mpt64 In Tb Diagnosis
4	The Dynamic Analysis Of The Status Of Mycobacterium Tuberculosis Infection Among Medical Staff And The Research On The Efficacy Of The New ELISA In The Diagnosis Of Tuberculosis
5	Research On Encephalic Tissue Recognition For MR Image Based On Support Vector Machine
6	Study On The New Crystal Amino Acid Formulas Enriched Branched-chain Amino Acid And Glutamine Dipeptide
7	Fabrication And Characterization Of A RFN/CDH Nano-scaled Ceramic Surface Via LbL Self-assembly And Prediction Subcellular Localization Of RFN/CDH
8	The Function Of Oxidation Sensor ILFT In Mycobacterium Tuberculosis
9	Machine Learning Methods For B-cell Epitope Prediction Based On Side-chain Information Of Protein
10	Anticancer Drug Response Classification Based On Deep Neural Network And Support Vector Machine