Application Of Web Services And XML In Bioinformatics Data Distribution And Integration

Posted on:2005-07-18

Degree:Master

Type:Thesis

Country:China

Candidate:X Li

Full Text:PDF

GTID:2120360152955371

Subject:Genetics

Abstract/Summary:

In the 1990s, the progress of the Human Genome Project signaled that biology had entered the genomic era. The most remarkable characteristic of this era is that the volume of biological data is growing at an exponential rate. More and more genomes are being sequenced and annotated, and data on proteins and genes continue to accumulate. With the rapid development of the World Wide Web (WWW), biological data are now mostly digital and are stored in a wide variety of formats on heterogeneous systems. They are spread across web sites all over the world, which provide biologists with much useful information. However, the complexity of biological data and the variety of data formats make it difficult to retrieve and integrate the data of interest. Compared with traditional structured data, biological data on the Web are semi-structured or unstructured and come in heterogeneous formats. Retrieving and integrating biological data is therefore a very important task. It is now widely recognized that the exchange, distribution, and integration of biological data are key to advancing bioinformatics and genomics in the post-genomic era. The Extensible Markup Language (XML) is rapidly spreading as an emerging standard for structuring documents in order to exchange and integrate data on the WWW. Web services, regarded as the next generation of the WWW, are founded upon the open standards of the W3C (World Wide Web Consortium) and the IETF (Internet Engineering Task Force). This thesis presents XML and Web services technologies and their use as an appropriate solution to the problem of bioinformatics data exchange and integration.

A number of differentially expressed cDNA fragments were obtained from Phanerochaete chrysosporium using Suppression Subtractive Hybridization (SSH) and microarray techniques, and 433 of them were sequenced. To manage and analyze these EST data, an analysis platform was constructed on the Linux operating system using the Phrap, EMBOSS, BLAST, GENSCAN, and MZEF software. The platform covers constructing EST and genome databases, removing vector sequences, sorting and assembling sequences, locating them on the genome, identifying exons and introns, and predicting genes. Moreover, Perl scripts built on BioPerl modules allow the analysis to run automatically. The results demonstrate that this robust platform can accelerate data analysis for large-scale EST sequences and offer useful information for cloning related genes and studying the functional genomics of Phanerochaete chrysosporium.
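
As a rough illustration of the XML-based data exchange that the abstract describes, the sketch below serializes a single EST record to XML and parses it back. It is not taken from the thesis: the element names (`est`, `organism`, `sequence`), the record ID, and the short sequence are hypothetical placeholders, and Python's standard library is used here only for brevity, whereas the thesis itself reports Perl scripts with BioPerl.

```python
import xml.etree.ElementTree as ET


def est_to_xml(est_id, organism, sequence):
    """Serialize one EST record to an XML string.

    The element and attribute names are hypothetical, chosen only
    for illustration; they do not follow any published schema.
    """
    root = ET.Element("est", attrib={"id": est_id})
    ET.SubElement(root, "organism").text = organism
    seq = ET.SubElement(root, "sequence", attrib={"length": str(len(sequence))})
    seq.text = sequence
    return ET.tostring(root, encoding="unicode")


def xml_to_est(xml_text):
    """Parse the XML produced by est_to_xml back into a plain dict."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.get("id"),
        "organism": root.findtext("organism"),
        "sequence": root.findtext("sequence"),
    }


if __name__ == "__main__":
    # Made-up record ID and sequence, used only to exercise the round trip.
    doc = est_to_xml("EST0001", "Phanerochaete chrysosporium", "ATGGCCAAGT")
    print(doc)
    print(xml_to_est(doc))
```

Because both sides agree only on the XML structure, a consumer written in another language could read the same record without knowing how the producer stores it internally, which is the property that makes XML attractive for integrating heterogeneous sources.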

Keywords/Search Tags:

biological data integration, biological data distribution, extensible Markup Language (XML), web services, bioinformatics, Phanerochaete chrysosporium, EST

Related items

1	Study On A GML&SVG-Based Model And Its Application
2	Web Services-Oriented Geographic Spatial Data Integration And Visualization
3	Key Technique And Prototype Of Mobile Device Oriented Olympic Games Space-Info Service
4	Research On A GML Query Mechanism
5	The Research And Application Of WebGIS Technology Based XML-SVG
6	Studies On The Gene Expression Of Different Metabolic Phase Of Phanerochaete Chrysosporium
7	Research Into Data Quality Control In Geographical Data Input
8	Research On The Key Technologies Of Multi-source Biological Data Integration And Mining
9	Topography And Geological Body 3d Visualization Research And Application
10	Bioinformatics Research For Omics Big Data