Analysis Of DNA Sequences Based On Data Field And 3-D Graphical Representation

Posted on: 2021-01-31
Degree: Master
Type: Thesis
Country: China
Candidate: Z Zheng
GTID: 2370330623975212
Subject: Applied Mathematics
Abstract/Summary:
In recent years, with the rapid development of biotechnology, the focus of biological research has gradually shifted from accumulating data to interpreting and analyzing it. Bioinformatics has therefore emerged and rapidly become a new frontier of biological research. As the name suggests, bioinformatics is the study of biological data. It is a developing interdisciplinary field that draws on mathematical methods, computer programs and knowledge from other subjects to store, retrieve and analyze biological information. Its research areas are broad; this thesis focuses on sequence analysis. The main contents are as follows. The 2-D graphical representation of DNA sequences proposed by Nandy has been used in many bioinformatics problems. Unfortunately, this graphical representation is degenerate. Guo later improved it by rotating the four directions in two-dimensional space by a small angle, which greatly reduces the degeneracy of the graphical representation. However, degeneracy is still not completely avoided. Inspired by the improvements of Guo and others, we propose a 3-D generalized Nandy graphical representation of DNA sequences by associating the four bases (A, C, G and T) with four direction vectors in three-dimensional space. We prove that the resulting graph contains no circuits, which guarantees that the representation is nondegenerate. We numerically characterize a DNA sequence by means of the ALE-index of the L/L matrix and the graph radius. Borrowing the concept of the gravitational field from physics, we construct a potential function among data objects in vector form, whose value reflects the relationship between sequences. The K-nearest neighbor algorithm serves as the classifier. The utility of the proposed approach is illustrated by the recognition of 208 RIG-I genes.
Keywords/Search Tags: Graphical representation, Numerical characterization, Data field, Potential function, RIG-I genes, Sequence analysis