Graphic Construction And Similarity Analysis Of Protein’s Primary Sequences

Posted on: 2016-11-24 | Degree: Master | Type: Thesis
Country: China | Candidate: S C Xu
GTID: 2180330467973253 | Subject: Electronic and communication engineering
Abstract/Summary:
In recent years, with the growing number of biological sequences in GenBank and the improvement of sequencing technology, more and more scholars have focused on alignment-free methods for comparing DNA and protein sequences. Compared with alignment-based methods, alignment-free methods offer better visualization and capture local features more effectively, so they can reveal more biological information. Building on alignment-free comparison, this thesis proposes a new graphical representation of a protein's primary sequence for similarity analysis and phylogenetic tree construction across the sequences of different species. The specific work is as follows:

1. Based on physicochemical properties of amino acids, namely hydrophobicity, pKα(COOH), and pKα(NH3+), we map the amino acids of a protein sequence to three-dimensional coordinates and use a conversion formula to make their distribution uniform. We then obtain the data points of the protein sequence by iterative (cumulative) summation. Finally, a continuous cubic B-spline curve interpolating the amino acid points is constructed to represent the shape of each protein sequence.

2. Once the protein curve is obtained, we extract the geometric properties (curvature and torsion) of the space curve of the protein's primary sequence. In 3D space, these describe how strongly the protein curve bends and twists. Next, we introduce several common similarity measures for feature vectors and propose a new circulation distance method based on the Canberra distance. We then test the method and find it feasible by computing the correlation coefficient between our results and the reference results obtained with ClustalW. The method has advantages over methods reported in the literature. Finally, a t-test on the correlation coefficient shows that the experimental result is not due to chance.

3. We construct a homologous phylogenetic tree from the similarity matrix produced by the experiments. Specifically, we describe the evolutionary distances among the ND5 proteins of nine species and the β-globin proteins of fifteen species using hierarchical clustering, and we compare the resulting trees with the phylogenetic trees produced by the MEGA and DNASTAR software. We find that our results are reliable.
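As a rough illustration of steps 1 and 2, the following Python sketch shows the general shape of the pipeline: cumulative summation of per-residue 3D vectors, the standard curvature and torsion formulas for a space curve, and the plain Canberra distance. It is a minimal sketch under stated assumptions, not the thesis's implementation. The table `TOY_PROPERTIES` holds placeholder numbers rather than real physicochemical data, all function names are hypothetical, the conversion formula and the cubic B-spline fit are omitted, and the thesis's circulation distance variant is not reproduced because the abstract does not define it.

```python
import math

# ASSUMPTION: placeholder (hydrophobicity, pK_COOH, pK_NH3) triples.
# These are illustrative numbers only, not real amino acid data and not
# the output of the thesis's conversion formula.
TOY_PROPERTIES = {
    "A": (1.0, 0.5, 2.0),
    "G": (0.0, 0.5, 1.5),
    "L": (2.0, 0.25, 1.5),
}


def sequence_to_points(seq, table):
    """Map a sequence to 3D points by cumulative summation of residue vectors."""
    points = []
    x = y = z = 0.0
    for residue in seq:
        dx, dy, dz = table[residue]
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(a):
    """Euclidean length of a 3D vector."""
    return math.sqrt(sum(c * c for c in a))


def curvature(d1, d2):
    """Curvature |r' x r''| / |r'|^3 from first and second derivative vectors."""
    speed = _norm(d1)
    if speed == 0.0:
        raise ValueError("curvature is undefined where the first derivative is zero")
    return _norm(_cross(d1, d2)) / speed ** 3


def torsion(d1, d2, d3):
    """Torsion (r' x r'') . r''' / |r' x r''|^2; returns 0.0 where it is undefined."""
    c = _cross(d1, d2)
    denom = sum(v * v for v in c)
    if denom == 0.0:
        return 0.0  # locally straight curve: torsion is undefined, report 0.0
    return sum(ci * di for ci, di in zip(c, d3)) / denom


def canberra(u, v):
    """Plain Canberra distance: sum of |u_i - v_i| / (|u_i| + |v_i|), skipping 0/0 terms."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom > 0.0:
            total += abs(a - b) / denom
    return total


if __name__ == "__main__":
    print(sequence_to_points("AGL", TOY_PROPERTIES))
    # Derivatives of the helix (cos t, sin t, t) at t = 0, used as a sanity check.
    print(curvature((0, 1, 1), (-1, 0, 0)), torsion((0, 1, 1), (-1, 0, 0), (0, -1, 0)))
```

In a fuller version, the derivative vectors passed to `curvature` and `torsion` would come from the fitted curve; one possible route is SciPy's `make_interp_spline` with `k=3` and the `derivative` method of the resulting spline. The curvature and torsion values sampled along each curve would then form the feature vectors that a distance such as `canberra` compares.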
Keywords/Search Tags:Protein sequence, Similarity analysis, Cubic B-spline curve, Curvature, Torsion