Research on Some Key Technologies of Computer Visualization of Genomic Information
| Posted on:2011-02-18 | Degree:Doctor | Type:Dissertation | Country:China | Candidate:A H Wang | GTID:1228330395958564 | Subject:Computer system architecture |

Abstract/Summary:

In recent years, with the rapid development of sequencing techniques in biology, vast volumes of genomic sequence data have been collected and stored. Effective methods supported by current computer technology are urgently needed to analyze and process this massive body of genomic sequence data and to extract valuable information from it. Computer visualization offers a new approach to genomic sequence research because it is intuitive and fast, and it has attracted the attention of a growing number of researchers. By applying computer visualization to genomic sequences, genomic data that are difficult to analyze and understand can be transformed into visible images, helping researchers find the characteristic information concealed in the sequences. This dissertation studies computer visualization methods for genomic sequence data. The main contributions are as follows.

(1) A visualization method for the GC content of genomic sequences is proposed, named the GC Content Double Triangular (GCDT) method. Using computer visualization techniques, this method transforms the distribution and variation of GC content in a genomic sequence into an image. The color of a pixel at a given position indicates the GC content of a particular segment of the sequence, so a large amount of GC content information can be expressed in a single image.
For a given genomic sequence, the GCDT method computes a series of GC content values from windows of different sizes and maps them, according to a specific rule, onto the pixels of one rectangular image composed of a colored lower triangle and a colored upper triangle. The distribution and variation of GC content along the sequence can be observed clearly in the image. This makes it convenient to inspect and analyze GC content intuitively and quickly, and provides a useful tool for research on GC content in genomic sequences.

(2) An algorithm for identifying GC isochore boundaries is proposed based on the GCDT method. GC isochore structure is an important characteristic of the genomic sequences of some species, and identifying the boundaries between neighboring GC isochores is the basis of isochore research. The GCDT-based algorithm takes into account GC content statistics from multiple windows of various sizes, so the boundaries it selects are more reasonable, which is significant for GC isochore research.

(3) Using the GCDT-based boundary identification algorithm, a GC isochore map of the whole human genomic sequence is obtained. In the GCDT image of the whole human genomic sequence, a mosaic structure can be clearly observed, as can the boundaries between neighboring GC isochores. This indicates that GC isochore structure does exist in the human genomic sequence. With this algorithm, the isochore organization of the human genomic sequence is systematically identified, and the boundaries of the GC isochore structures of each human chromosome are found.
As GC isochores are related to many functions of genomic sequences, the GC isochore map is significant for research on human genomic sequences.

(4) A GIDT visualization method applicable to general characteristics of genomic sequences is proposed. It extends the GCDT method to a wider range of applications. The method can visually represent genomic characteristics such as GC content, the content of individual bases and their combinations, the direction and magnitude of base skew, SNPs, and so on. With the GIDT method, the characteristic information of genomic sequences can be displayed intuitively in a two-dimensional image on the computer screen, providing an effective tool for research on general characteristics of genomic sequences.

(5) The DHPC visualization method is investigated and implemented. DHPC visualization is well suited to representing the overall information of large-scale genomic sequences. Because a Hilbert-Peano curve fills an entire square area, one-dimensional genomic sequences can be mapped onto two-dimensional images. The organizational information of a genomic sequence, including both global and local characteristics, can be expressed in a single DHPC image. This dissertation analyzes the DHPC method, discusses some key factors, and implements the corresponding software system.

(6) Software systems are designed and implemented for all the genomic visualization methods discussed in this dissertation. The basic structure, module features, workflow, and operating environment of each system are described.
The key problems and difficulties in software design and implementation are also analyzed.

In this dissertation, some key problems in genomic visualization methods are explored, and in-depth research is carried out on the visualization of genomic characteristics and of whole genomic sequences. Several practical visualization methods are proposed, which provide new tools and methods for genomic research. Good results have been obtained in practical applications, indicating that genomic visualization methods and techniques have clear advantages and broad application potential in genomic research. This work makes some useful initial attempts at genomic visualization. The methods will be improved in the future and applied to more areas.

Keywords/Search Tags: visualization method, bioinformatics, genomic sequence, GC content, isochore boundaries, GC Double Triangles, Genome Information Double Triangles
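
The abstract describes GC content statistics taken from windows of several sizes, but it does not state the rule GCDT uses to place those values in the image. The sketch below therefore illustrates only the multi-window statistic. The function names `gc_fraction` and `gc_by_window`, and the choice of non-overlapping windows, are assumptions made for illustration and are not taken from the dissertation.

```python
def gc_fraction(segment: str) -> float:
    """Return the fraction of G and C bases in a segment (case-insensitive)."""
    if not segment:
        return 0.0
    upper = segment.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


def gc_by_window(sequence: str, window_sizes: list[int]) -> dict[int, list[float]]:
    """GC fraction of consecutive windows, computed once per window size.

    Assumption: windows do not overlap, and a trailing partial window is
    dropped. The dissertation may use a different windowing scheme.
    """
    result: dict[int, list[float]] = {}
    for size in window_sizes:
        if size <= 0:
            raise ValueError("window size must be positive")
        result[size] = [
            gc_fraction(sequence[start:start + size])
            for start in range(0, len(sequence) - size + 1, size)
        ]
    return result


if __name__ == "__main__":
    # Toy sequence: GC-rich and AT-rich blocks alternate every four bases.
    profile = gc_by_window("GGCCATATGCGCATAT", [4, 8])
    for size, values in profile.items():
        print(size, values)
```

With small windows the alternating blocks of the toy sequence are visible, while larger windows average them out. This is presumably why combining several window sizes in one image is informative.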
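
Item (5) relies on a space-filling curve to lay a one-dimensional sequence out in a square. As a rough illustration, the sketch below uses the standard Hilbert curve index-to-coordinate conversion to place each base in a grid. Whether DHPC uses this exact curve variant, orientation, or one base per cell is not stated in the abstract, and the names `hilbert_d2xy` and `sequence_to_grid` are hypothetical.

```python
def hilbert_d2xy(side: int, d: int) -> tuple[int, int]:
    """Convert distance d along a Hilbert curve to (x, y) in a side x side grid.

    side must be a power of two and 0 <= d < side * side.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so sub-curves join end to end.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def sequence_to_grid(sequence: str, side: int, fill: str = ".") -> list[list[str]]:
    """Place each base of the sequence along the curve; unused cells keep `fill`."""
    if side < 1 or side & (side - 1):
        raise ValueError("side must be a power of two")
    if len(sequence) > side * side:
        raise ValueError("sequence does not fit in the grid")
    grid = [[fill] * side for _ in range(side)]
    for d, base in enumerate(sequence):
        x, y = hilbert_d2xy(side, d)
        grid[y][x] = base
    return grid


if __name__ == "__main__":
    for row in sequence_to_grid("ACGTACGTACGTACGT", 4):
        print("".join(row))
```

Because consecutive positions on the curve are always neighboring cells, bases that are close in the sequence stay close in the image, which is what lets local features show up as compact regions.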