
Bioinformatic Analysis Of SARS-CoV Genome And Putative Structural Proteins Sequences

Posted on: 2005-11-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S C Liu
GTID: 1104360122990950
Subject: Cell biology
Abstract/Summary:
INTRODUCTION

Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) is a novel coronavirus, and most of its structural and functional characteristics are unknown. The aim of this project was to analyze variation in the complete genome sequences of SARS-CoV isolates from different sources and to characterize the pattern and variability of the nucleotide changes. Geographic evolution and epidemic features were examined through phylogenetic analysis of 59 SARS-CoV nucleotide sequences and the distribution of co-variation. The four structural proteins of SARS-CoV isolates from different sources (spike protein, nucleocapsid protein, membrane protein and small envelope protein) were analyzed with bioinformatic tools to determine their physical and biological features, their mutation status, and the influence of the mutations on protein structure and function, and to provide a theoretical basis for vaccine development.

MATERIALS AND METHODS

Complete genome sequences of 59 SARS-CoV isolates from different sources were obtained from GenBank and aligned with ClustalX 1.83 to find the variation sites and their distribution along the sequence, and a cladogram was drawn. Spike protein amino acid sequences in FASTA format were submitted to web-based bioinformatic servers to predict motifs, signal peptides, low-complexity regions, transmembrane segments and antigenic determinants, and to find the differences among the isolates.

The nucleotide and amino acid sequences of the putative spike protein (41 isolates), nucleocapsid protein (44 isolates), membrane protein (39 isolates) and small envelope protein (36 isolates) were aligned with ClustalX 1.83 to find variations and mutations in the sequences. DNATools and the ProtParam tool on the ExPASy server were used to calculate molecular weight, isoelectric point, amino acid composition, molecular formula, number of atoms, half-life, instability index, etc.
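Two of the quantities listed above, amino acid composition and molecular weight, follow directly from the sequence. The sketch below is a minimal illustration, not a reproduction of ProtParam itself: it assumes standard average residue masses, the function names `composition` and `molecular_weight` are hypothetical, and the input is a made-up toy peptide rather than a SARS-CoV sequence.

```python
from collections import Counter

# Average residue masses in daltons (free amino acid minus one water).
# Assumption: standard average-isotope values; not taken from the dissertation.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free N- and C-termini


def composition(seq):
    """Return each residue's share of the sequence, in percent."""
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def molecular_weight(seq):
    """Sum the average residue masses and add one water molecule."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


toy = "MLTLLTWK"  # hypothetical toy peptide, not a SARS-CoV sequence
for aa, pct in sorted(composition(toy).items(), key=lambda kv: -kv[1]):
    print(f"{aa}: {pct:.1f}%")
print(f"MW: {molecular_weight(toy):.1f} Da")
```

Isoelectric point, half-life and instability index depend on lookup tables and conventions specific to the tool, so they are left out of this sketch.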
Transmembrane helices, coiled coils and signal peptides were predicted with TMHMM Server 2.0 and SignalP 2.0. Motifs and antigenic determinants in the structural protein sequences were predicted with SMART 3.4, Predicting Antigenic Peptides and the PredictProtein server, and the influence of the mutations on the function and structure of the structural proteins was analyzed.

RESULTS

A total of 477 variation sites were identified in the SARS-CoV genome sequences, with a variability of 0.474, including 28 deletions, 71 inserts and 378 nucleotide substitutions. The numbers of substitutions occurring at nucleotides A, T, C and G are 115, 113, 87 and 65, respectively. The 59 SARS-CoV isolates can be divided into three groups according to the phylogenetic tree analysis.

The molecular weight of the putative SARS-CoV spike protein is 139109.1 D and its isoelectric point is 5.65. There are 30 mutations at 20 sites in 10 isolates; no mutations occurred in the spike protein sequences of the remaining 31 isolates. Leucine and threonine make up the highest proportion of the amino acid composition, and tryptophan the lowest. A cysteine-rich domain 20 nucleotides in length was predicted near the C-terminus. Three low-complexity regions, one coiled coil, one TMH and one probable signal peptide were predicted in the spike protein sequence. Globularity and a Pfam domain were predicted with the PredictProtein server. The spike protein sequence contains three helical structures, presumed to be the N helix, M helix and C helix, respectively. A total of 73 motifs were found in the spike protein sequences of all isolates except GD01, which contains one additional casein kinase II phosphorylation site.
Eight of the ten mutation sites found in the spike protein sequences fall within antigenic determinant sequences, changing either the amino acid composition or the number of the determinants.

The molecular weight of the putative SARS-CoV nucleocapsid protein is 46025.0 D and its isoelectric point is 10.93. There are 9 mutations in 7 s...
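The genome-level tally in the Results (deletions, inserts and substitutions) implies a column-by-column classification of the multiple alignment. The sketch below shows one way such a tally could be made; it is an assumption, not the dissertation's procedure. The function name `classify_variation_sites`, the use of the first sequence as the reference, and the toy alignment are all hypothetical, and the criteria actually used are not stated in this abstract.

```python
def classify_variation_sites(aligned, ref_index=0):
    """Count variable alignment columns relative to a reference sequence.

    Assumed rules (illustrative only):
      - insertion:    reference has a gap, at least one other sequence has a base
      - deletion:     reference has a base, at least one other sequence has a gap
      - substitution: no gaps in the column, at least one base differs from the reference
    """
    ref = aligned[ref_index]
    if any(len(s) != len(ref) for s in aligned):
        raise ValueError("all aligned sequences must have the same length")

    counts = {"substitution": 0, "deletion": 0, "insertion": 0}
    for col, ref_base in enumerate(ref):
        others = {s[col] for i, s in enumerate(aligned) if i != ref_index}
        if ref_base == "-":
            if others - {"-"}:
                counts["insertion"] += 1
        elif "-" in others:
            counts["deletion"] += 1
        elif others - {ref_base}:
            counts["substitution"] += 1
    return counts


# Hypothetical toy alignment ('-' marks a gap); not real SARS-CoV data.
toy_alignment = [
    "ATG-CGTA",  # reference
    "ATGACGTA",  # one inserted base relative to the reference
    "A-G-CCTA",  # one deleted base and one substituted base
]
counts = classify_variation_sites(toy_alignment)
print(counts, "total variable sites:", sum(counts.values()))
```

A column that contains both a gap and a substituted base is counted once, as a deletion, under these rules; a different convention would shift the individual counts but not the total number of variable sites.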
Keywords/Search Tags:Severe Acute Respiratory Syndrome, SARS, Structural protein, Bioinformatics, Mutation analysis