Construction Of An Integrated Bioinformatic Analysis System Based On Local And WEB Resources And Preliminary Experimental Studies Of Two Novel Human Genes

Posted on: 2001-12-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C G Zhang
GTID: 1100360155976263
Subject: Molecular biology
Abstract/Summary:
The human fetal liver at 22 weeks of gestation is a major site of embryonic hematopoiesis and immune development. In theory, therefore, many important novel genes should be transcribed at this stage. We used large-scale cDNA sequencing to analyze the groups of genes associated with the normal physiological function of the fetal liver, embryonic hematopoiesis and immune development. We performed single-pass sequencing of about 20,000 randomly selected, directionally cloned cDNAs isolated from a fetal-liver (22 weeks of gestation) cDNA library constructed with oligo(dT) primers. About 16,000 EST sequences and 511 full-length-insert cDNA clones were obtained. To analyze this large number of sequences, we constructed an integrated bioinformatic analysis system based on local and WEB resources. In addition, two novel human genes were further studied experimentally.
1. Construction of an integrated bioinformatic analysis system based on local and WEB resources: On a personal computer running the Linux operating system, the Phred/Phrap/Consed software and Blast software were used to construct a platform for batch analysis of the sequences, including reading raw DNA sequence from chromatogram files, vector sequence removal, contig analysis (sequence assembly), repeat sequence identification and sequence similarity analysis. Moreover, bioinformatic resources available via the WEB were collected to construct a homepage for the data mining procedure. The results demonstrated that this robust platform could accelerate data analysis for large-scale DNA sequencing.
2. Characterization, chromosomal assignment, and tissue expression of a novel human gene belonging to the ARF GAP family: We have identified and characterized a novel human ADP-ribosylation factor GTPase activating protein (ARFGAP1) gene that is related to other members of the ARF GAP family.
The full-length cDNA for human ARFGAP1 was cloned after an EST obtained by large-scale cDNA library sequencing was identified through a Blast search of public databases. Structurally, ARFGAP1 encodes a polypeptide of 516 amino acids, which contains a typical GATA-1 type zinc finger motif (CXXCX16CXXC) whose four cysteine residues are highly conserved among other members of the ARF GAP family. The conservation of the ARF GAP domain suggests the biological importance of this gene. The ARFGAP1 gene, which contains sixteen exons ranging from 0.5 kb to 9.3 kb, was mapped to human chromosome 22q13.2-13.3 using radiation hybrid mapping and in silico analysis. ARFGAP1 is strongly expressed in endocrine glands and testis. Interestingly, the expression of ARFGAP1 in testis is about 6-fold higher than that in ovary, implying a possible role of ARFGAP1 in sperm physiological function. Expression of ARFGAP1 in four human fetal tissues and seven cancer cell lines was also detected.
3. Human MAGE-D1 represents a new MAGE subfamily whose genes are located in the Xp11.21-p11.23 region, are ubiquitously expressed and do not code for known MAGE antigens: Genes of the MAGE (melanoma antigen-encoding gene) family are expressed in a wide variety of tumors but not in normal cells, with the exception of male germ cells, placenta, and, possibly, cells of the developing fetus. However, the discovery of MAGE-D (Cancer Res, 59:4100-4103, 1999) and MAGE-D1 (Genomics, 59:161-167, 1999) has broken this rule, since both are ubiquitously expressed in normal tissues rather than strictly in cancer and testis. Of the two, MAGE-D was shown not to code for any known MAGE antigens recognized by T cells. However, the MAGE family genes are still far from being understood. Here we report new findings on human MAGE-D1 and on the new group of MAGE family genes it represents.
The full-length cDNA sequence of human MAGE-D1 is 2800 base pairs in size and encodes a polypeptide of 778 amino acids, rather than the previously reported 574 amino acids (Genomics, 59:161-167, 1999), that is localized in both the nucleus and the cytoplasm. Human MAGE-D1 is extensively expressed in cancer cell lines and ubiquitously expressed in many normal tissues. Among the forty-eight kinds of tumors examined, expression of human MAGE-D1 was significantly up-regulated in thirteen and down-regulated in seven, suggesting a potential diagnostic value of MAGE-D1 in clinical application. Sequence analysis of genes homologous to human MAGE-D1 revealed a new group within the MAGE family, designated the MAGE-D subfamily. The MAGE-D subfamily comprises three orthologous members, namely Homo sapiens MAGE-D1, Rattus norvegicus SNERG-1 and Mus musculus DLXIN-1, and two paralogous members, namely Homo sapiens MAGE-D and Homo sapiens KIAA1114. The human members of the MAGE-D subfamily are all located at Xp11.21-p11.23 and form another cluster of the MAGE family on chromosome X. According to the phylogenetic tree of the whole MAGE family, the MAGE-D subfamily is an independent super-cluster, evolutionarily divergent from that formed by subfamilies MAGE-A, -B and -C. The MAGE-D subfamily represents a new class of the MAGE family whose members are expressed ubiquitously in normal tissues as well as in tumors and do not code for any known MAGE antigens recognized by T cells. The possible function of the MAGE-D subfamily is discussed in light of the study of trophinin (Genes Dev, 9(10):1199-1210, 1995), which is a segment of KIAA1114.
In summary, the MAGE-D subfamily stands out from the other subfamilies of the MAGE family (MAGE-A, -B and -C) in chromosomal localization, exon/intron organization, absence of tumor-specific antigens, ubiquitous expression, subcellular localization and evolutionarily divergent sequences.
In conclusion, the platform for integrated bioinformatic analysis proved to be a powerful tool for large-scale EST analysis. As the structural genome project draws to a close and the functional genome project is just beginning, substantial effort should be devoted to the experimental study of the many novel genes in order to determine their function.
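The GATA-1 type zinc finger motif (CXXCX16CXXC) described for ARFGAP1 can be read as a simple pattern: a cysteine, any two residues, a cysteine, any sixteen residues, a cysteine, any two residues, and a final cysteine. The following is a minimal sketch of how such a pattern might be searched for in a protein sequence. It is an illustration only, not the method used in the dissertation; the function name `find_zinc_finger_motifs` and the example sequence are hypothetical.

```python
import re

# Assumption: "X" in CXXCX16CXXC means any single amino acid (one uppercase letter).
# The lookahead lets overlapping occurrences be reported as well.
ZINC_FINGER = re.compile(r"(?=(C[A-Z]{2}C[A-Z]{16}C[A-Z]{2}C))")


def find_zinc_finger_motifs(protein):
    """Return (0-based start, matched segment) for each CXXCX16CXXC occurrence."""
    return [(m.start(), m.group(1)) for m in ZINC_FINGER.finditer(protein.upper())]


if __name__ == "__main__":
    # Made-up sequence for demonstration; it is not the ARFGAP1 sequence.
    demo = "MK" + "CAAC" + "G" * 16 + "CTTC" + "LL"
    for start, segment in find_zinc_finger_motifs(demo):
        print(start, segment)
```

A pattern match of this kind only flags candidate sites; confirming a functional zinc finger would still require comparison with known family members.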
Keywords/Search Tags:human fetal liver, large-scale cDNA sequencing, bioinformatics, personal computer, Linux operating system, WEB, Phred/Phrap/Consed software, Blast software, ARF GAP family, human ARFGAP1, Zinc finger, MAGE family, human MAGE-D1, MAGE-D subfamily