Analysis Of SARS-CoV-2 Genome Sequence Based On Variant Measurement

Posted on: 2022-04-29 | Degree: Master | Type: Thesis
Country: China | Candidate: F Deng | Full Text: PDF
GTID: 2480306335456824 | Subject: Computer Software and Application of Computer
Abstract/Summary:
The rapid development of the information era has produced many mature software products and technologies, including gene sequence alignment software and gene sequencing technology. Gene banks are growing ever larger. DNA is essential to the development and normal functioning of organisms, and how to use these DNA sequences effectively to analyze biological characteristics and obtain genetic information has become a challenge in current science. Analyzing large volumes of DNA data, however, not only increases researchers' workload but is also prone to error. Visualization, a research tool as important as experiment, theory and method, can handle data as cumbersome and complex as DNA sequences. It is intuitive, simple and effective, and these advantages make it an indispensable part of data analysis. DNA sequence visualization has long been an active topic in bioinformatics research, and it has gradually extended to other fields and become part of their research.

SARS-CoV-2 is an RNA coronavirus that seriously threatens human life and health and disrupts social order. It has an error-prone RNA-dependent RNA polymerase (RdRp), so mutation and recombination events occur often. Based on the SARS-CoV-2 gene sequences provided by research institutions in various countries, a SARS-CoV-2 evolution map is generated on the GISAID website. The map shows that SARS-CoV-2 can be divided into 9 clades, each containing multiple branches, and that across all samples, relative to a common ancestor, up to eight mutations are highly correlated, which indicates that the RNA sequence of SARS-CoV-2 is constantly changing. With a virus this mutable, fast-spreading and widespread, studying the distribution of its RNA sequence and analyzing its genetic data has become increasingly urgent.

This thesis studies the gene sequence of SARS-CoV-2 using the theory of variant logic combined with visualization methods. The variant map system has three parts: data processing, variant measurement and variant projection. Data processing: a variable parameter is introduced to segment the RNA sequence, cutting the whole genome sequence into multiple sub-sequences. Variant measurement: the number of each base in each sub-sequence is counted and the base probabilities are calculated. Variant projection: the nucleotide probabilities are reorganized into a probability matrix, and the matrix is projected onto coordinate axes to form a variant map.

The experimental results combine the gene sequence projection diagrams with the average information entropy projection diagrams. The first set of experiments shows that the Wuhan sequence is most similar to SARS-CoV; the second set shows that the Wuhan sequence is most similar to the environmental sequence; the third set, an analysis of the RdRp sequence, shows that the variant map system can identify mutated bases, and the information entropy projection diagram indicates that the information content of the RdRp fragments is similar.
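The three stages described in the abstract (segmentation, base-probability measurement, projection) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the choice to drop a trailing partial segment, the handling of ambiguous bases, and the use of the pair (p_A, p_G) as 2D projection coordinates are all assumptions made for illustration. The entropy is the standard Shannon entropy of the four base probabilities; the thesis's exact definition of "average information entropy" may differ.

```python
import math
from collections import Counter

BASES = "ACGU"  # assumption: RNA alphabet; use "ACGT" for DNA-style FASTA records


def segment(sequence, m):
    """Data processing: cut the sequence into consecutive sub-sequences of length m.

    Assumption: a trailing partial segment shorter than m is dropped.
    """
    return [sequence[i:i + m] for i in range(0, len(sequence) - m + 1, m)]


def base_probabilities(sub_sequence):
    """Variant measurement: count each base and convert counts to probabilities.

    Assumption: symbols outside BASES (e.g. 'N') are ignored, so the
    probabilities are taken over the recognized bases only.
    """
    counts = Counter(b for b in sub_sequence if b in BASES)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(BASES)
    return [counts.get(b, 0) / total for b in BASES]


def shannon_entropy(probs):
    """Shannon entropy in bits of one probability row."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def probability_matrix(sequence, m):
    """One row [p_A, p_C, p_G, p_U] per sub-sequence."""
    return [base_probabilities(s) for s in segment(sequence, m)]


def project(matrix, x_base="A", y_base="G"):
    """Variant projection (hypothetical choice): map each row to a 2D point
    using the probabilities of two selected bases as coordinates."""
    ix, iy = BASES.index(x_base), BASES.index(y_base)
    return [(row[ix], row[iy]) for row in matrix]


def mean_entropy(matrix):
    """Average Shannon entropy over all sub-sequences."""
    if not matrix:
        return 0.0
    return sum(shannon_entropy(row) for row in matrix) / len(matrix)


if __name__ == "__main__":
    toy = "AACGACGUGGGG"  # toy input, not real SARS-CoV-2 data
    matrix = probability_matrix(toy, 4)
    print(project(matrix))
    print(mean_entropy(matrix))
```

The segment length `m` plays the role of the variable parameter mentioned in the abstract: smaller values give more points with coarser probabilities, larger values give fewer, smoother points. The projected points could then be drawn as a scatter plot to compare sequences visually.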
Keywords/Search Tags: SARS-CoV-2, Variant measurement, RNA visualization, Entropy