
A Research On Synthesized Visualization Data Management Method Of Molecular Structure And Gene Sequence

Posted on: 2014-09-29
Degree: Master
Type: Thesis
Country: China
Candidate: C Zou
Full Text: PDF
GTID: 2180330479979417
Subject: Software engineering
Abstract/Summary:
Systems biology has developed rapidly in recent years. To explore associations between bioinformatics data at different levels and in different areas, there is an increasingly urgent demand for integrated (synthesis) visualization of biological data across levels. Biologists have called for a unified integration framework that combines visualization methods from different levels and areas to achieve synthesis visualization of biological data. Such a framework would enable researchers to observe everything from genes, molecules, cells, and organs up to the whole living tissue. However, conventional bioinformatics visualization tools are mostly developed to study one particular category of experimental data in isolation. For instance, the software packages used to view genomic sequences, molecular structures, and genomic comparisons are independent of one another, which makes it difficult for users to find associations between data from different biological levels and areas. Achieving synthesis visualization of biological data at all levels is therefore a huge undertaking that faces many technical challenges.
Driven by these application needs, this thesis studies and discusses the key technologies for synthesis visualization across the molecular and genomic fields. The mainstream visualization methods for molecular structure and genomic sequence data are generally accepted by experts, but their data representations are independent of each other, which makes it difficult to extract associations between data from the two areas. Organizing and managing biological data from different levels is the basic problem for integrated visualization.
This thesis studies the features of molecular structure and genomic sequence visualization data objects, the synthesis visualization metadata model and metadata generation method, and the system architecture and implementation technology of the synthesis visualization data management module. The main work and research results are as follows:
1. A synthesis visualization metadata model for genomic sequence and molecular structure is proposed, based on an analysis of the characteristics of mainstream molecular simulation tools and genomic sequence visualization data sources. By defining three types of metadata model (molecular structure, gene sequence, and association information), this thesis establishes the association between the two levels and areas and defines the description format of the associated data. This provides the theoretical model for metadata generation and for the integrated management of gene sequence and molecular structure data.
2. Based on the integrated management needs of molecular structure and gene sequence data, this thesis designs a system architecture for the synthesis data management module with three functions: tool data management, synthesis data management, and application service management. Tool data management provides unified management of the tool document libraries, the file database, and the directory database. Synthesis data management manages gene sequence visualization data, molecular structure visualization data, and their associated data. Application service management manages system service data such as user access data. The synthesis data management module is the core of data organization and management in the synthesis visualization integration framework.
3. Genome sequence and molecular structure visualization data are stored in text files with specified formats, which makes it difficult to fully support synthesis visualization data manipulation. This thesis therefore proposes an XML-based bioinformatics metadata generation method. The method first extracts metadata from the unstructured text data and describes it in XML. Then, according to predefined templates, the system converts the semi-structured XML data into a structured form that can be stored in a relational database. Building on the synthesis visualization metadata model and XML, this method turns unstructured text data that cannot be manipulated directly into structured data that can be stored in a relational database. It is the key technology for synthesis visualization data management.
4. Building on the above research, this thesis designs and implements a synthesis visualization data management module for molecular structure and genomic sequence and embeds it into an integration framework. It then builds a synthesis visualization integration framework for molecular structure and genomic sequence and uses it to integrate a molecular structure visualization tool (VMD, mainstream and open source) and a genomic sequence visualization tool (JBrowse, mainstream and open source). Finally, this thesis implements a prototype system for synthesis visualization of molecular structure and genomic sequence data. The prototype system provides an initial implementation of several synthesis visualization functions and achieved ideal experimental results. This work demonstrates that the data organization and management techniques and methods proposed in this thesis are sound and effective.
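The two-step metadata generation method described in the abstract (text to XML, then XML to relational rows via a predefined template) can be illustrated with a minimal sketch. This is not the thesis's implementation: the simplified PDB-style record layout, the `TEMPLATE` mapping, the `molecule_metadata` table, and the `gene_id` association column are all assumptions made for illustration, and only Python standard-library calls are used.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, simplified PDB-style header text (keyword, then value).
RAW_TEXT = """\
HEADER    OXYGEN TRANSPORT
TITLE     EXAMPLE HEMOGLOBIN STRUCTURE
DBREF     GENE_HBB
"""

# Hypothetical predefined template: XML element name -> relational column.
TEMPLATE = {"header": "classification", "title": "title", "dbref": "gene_id"}


def text_to_xml(raw: str) -> ET.Element:
    """Step 1: extract keyword/value records from text and describe them in XML."""
    root = ET.Element("molecule")
    for line in raw.splitlines():
        if not line.strip():
            continue  # skip blank lines
        keyword, _, value = line.partition(" ")
        ET.SubElement(root, keyword.lower()).text = value.strip()
    return root


def xml_to_row(root: ET.Element, template: dict) -> dict:
    """Step 2: map semi-structured XML elements to columns using the template."""
    return {column: root.findtext(tag) for tag, column in template.items()}


def store(conn: sqlite3.Connection, row: dict) -> None:
    """Store one structured row; gene_id is the assumed link to sequence data."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS molecule_metadata "
        "(classification TEXT, title TEXT, gene_id TEXT)"
    )
    conn.execute(
        "INSERT INTO molecule_metadata (classification, title, gene_id) "
        "VALUES (:classification, :title, :gene_id)",
        row,
    )
```

Calling `store(sqlite3.connect(":memory:"), xml_to_row(text_to_xml(RAW_TEXT), TEMPLATE))` runs the whole pipeline on the sample text. Keeping the template as data rather than code means a new source format would only need a new mapping, which is one plausible reading of the "pre-defined templates" the abstract mentions.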
Keywords/Search Tags:Bioinformatics Visualization, Synthesis Visualization, Metadata Model, Data Management, Integration Framework