Research On Key Technology Of Grass Carp Genome Information Management System

Posted on: 2020-10-28
Degree: Master
Type: Thesis
Country: China
Candidate: M Tang
GTID: 2393330590983823
Subject: Computer technology
Abstract/Summary:
With the release of the draft genome of the grass carp, further exploration of the molecular mechanisms behind the economic value of grass carp has attracted wide attention. These complex biological mechanisms need to be studied at the whole-genome scale. Establishing a genome information management system is the basis for storing and managing the heterogeneous data involved in whole-genome studies of the grass carp, and the data should be visualized in a targeted way so that researchers can browse, obtain and study them more conveniently and intuitively.

Heterogeneous data complicates data management and storage. This paper systematically compiles the grass carp annotation data that need to be stored and managed, and analyzes the relationship between the grass carp genome annotation data and the organization of several international genomic data file formats. A genome annotation data model based on the GFF3 format was set up to manage the grass carp genome data. Simple and advanced retrieval of the database is supported through the various attributes stored in the data model. In addition, several data format conversion scripts were created to automatically convert genomic data files in various formats into GFF3, so that heterogeneous data can be managed uniformly.

As sequencing data continue to grow, this paper also investigates DNA compression algorithms in order to reduce storage consumption and facilitate the transmission of DNA sequence data. The current state of research on DNA compression algorithms and its bottlenecks are analyzed. Drawing on the characteristics of grass carp DNA data, the designed compression algorithm incorporates bioinformatics features of the grass carp sequence; it achieves better compression than traditional algorithms and effectively reduces the cost of data storage and transmission.

In the course of this research, an information management system for grass carp genome data was developed. A JBrowse-based integration interface for genome annotation data was also designed and implemented, using the open source JBrowse module as the visualization component. For data that cannot be visualized with JBrowse, the SVG graphics format is used instead. Visualizing the data and strengthening its links to other biological databases, together with rich search capabilities, can further improve the efficiency of data mining. The information management system supports the comparison and sharing of data, and its targeted visualization solutions help researchers mine more meaningful information from massive data, which provides an effective way to promote the development of biological genetics. The designed DNA compression algorithm, BioGenCompress, achieves good compression on grass carp DNA, saving storage space and reducing transmission costs. The methods used in the system can serve as a reference for the construction of other genomic data platforms.
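The abstract does not describe the conversion scripts or name the source formats they handle. As a minimal sketch of what converting one format into GFF3 can look like, the Python below turns a single BED record into a GFF3 line. BED as the input format, the function names `bed_to_gff3` and `escape_attr`, and the default `source` and `feature_type` values are all assumptions made for illustration, not details from the thesis.

```python
# Hypothetical sketch: convert one BED record to one GFF3 line.
# BED input, the function names, and the defaults are illustrative assumptions.

# Characters that have to be percent-encoded inside GFF3 column 9.
_RESERVED = {"%": "%25", ";": "%3B", "=": "%3D", "&": "%26", ",": "%2C", "\t": "%09"}


def escape_attr(value: str) -> str:
    """Percent-encode only the characters reserved in GFF3 attributes."""
    return "".join(_RESERVED.get(ch, ch) for ch in value)


def bed_to_gff3(bed_line: str, source: str = "bed2gff3", feature_type: str = "region") -> str:
    """Map the first six BED columns onto the nine GFF3 columns."""
    fields = bed_line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("a BED record needs at least chrom, start and end")

    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    name = fields[3] if len(fields) > 3 else None
    score = fields[4] if len(fields) > 4 else "."
    strand = fields[5] if len(fields) > 5 else "."

    # BED coordinates are 0-based and half-open; GFF3 is 1-based and closed,
    # so only the start coordinate shifts by one.
    gff_start = start + 1

    attributes = f"ID={escape_attr(name)}" if name else "."
    # Column 8 (phase) only applies to CDS features, so it is left as ".".
    return "\t".join(
        [chrom, source, feature_type, str(gff_start), str(end), score, strand, ".", attributes]
    )


if __name__ == "__main__":
    print(bed_to_gff3("chr1\t99\t200\tgeneA\t0\t+"))
```

The main pitfall in this kind of conversion is the coordinate convention, since an off-by-one start would misplace every feature in a genome browser such as JBrowse.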
Keywords/Search Tags: genome information management of the grass carp, annotation data model, visualization technology, DNA compression algorithm