
Database Construction And Sequence Features Analysis Of Genomic Copy Number Variation

Posted on: 2011-10-29
Degree: Master
Type: Thesis
Country: China
Candidate: L Dai
GTID: 2120360305994366
Subject: Genetics
Abstract/Summary:
Objective:
1. To establish a Genomic Copy Number Variation (CNV) Database of Growth and Mental Retardation.
2. To analyze the sequence features of CNVs in order to study their formation mechanism.

Methods:
1. The Genomic Copy Number Variation Database of Growth and Mental Retardation was built on a Windows + Apache + MySQL + PHP (WAMP) development platform.
2. The distribution of low-copy repeats/segmental duplications (LCRs/SDs) in breakpoint regions was identified with the web-based UCSC Genome Browser. To investigate the repeat sequence elements (SINEs, LINEs, LTRs, etc.) in the regions where these CNVs occurred, the sequences flanking each breakpoint (5 kb at each end) and 504 control sequences were analysed using RepeatMasker.

Results:
1. The Genomic Copy Number Variation Database of Growth and Mental Retardation was established. The system comprises an administrator login system, a database query system and a database management system. It holds 812 CNV records from 168 patients with growth and mental retardation.
2. The 297 CNVs detected with the Human610-Quad BeadChip fall into four categories:
(1) proximal and distal breakpoint regions are both enriched for LCRs with high sequence similarity (19/297; 6.40%);
(2) proximal and distal breakpoint regions are both enriched for LCRs, but with low sequence similarity (53/297; 17.85%);
(3) only one breakpoint region harbours LCRs (80/297; 26.94%);
(4) no LCR lies in the vicinity of either breakpoint (145/297; 48.81%).
Repeat sequence elements found in breakpoint regions and in control sequences:
Alu SINEs: breakpoint regions 314/504 (62.30%); control 338/504 (67.06%)
MIR SINEs: breakpoint regions 191/504 (37.90%); control 207/504 (41.07%)
L1 LINEs: breakpoint regions 328/504 (65.10%); control 344/504 (68.25%)
L2 LINEs: breakpoint regions 155/504 (30.75%); control 171/504 (33.93%)
L3 LINEs: breakpoint regions 35/504 (6.94%); control 37/504 (7.34%)
LTR: breakpoint regions 266/504 (52.78%); control 253/504 (50.20%)

Conclusions:
1. The Genomic Copy Number Variation Database of Growth and Mental Retardation was established. Data entry with this database is rapid, complete and reliable, and data queries are convenient and dependable.
2. In 24.25% of these CNVs, both breakpoint regions carried LCRs, and in 26.39% of those, the high degree of sequence similarity identified non-allelic homologous recombination (NAHR) as the most likely cause of the rearrangement. CNVs with LCRs at only one of the two breakpoints (80/297) and CNVs with no LCR in the vicinity of either breakpoint (145/297) are unlikely to involve NAHR.
3. Genomic instability of microsatellite repeats may be involved in the formation of CNVs. No significant association with the other repeat elements was observed.
4. Alu-mediated NAHR may be involved in the formation of LCRs/SDs. LCR/SD-mediated NAHR has been suggested as a possible mechanism of CNV formation.
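The breakpoint-versus-control comparison of repeat elements can be illustrated with a short Python sketch. It recomputes the frequencies from the counts reported in the Results and applies a Pearson chi-square test to each 2x2 table. The abstract does not say which statistical test the thesis used, so the choice of test, the 5% threshold and the names `chi_square_2x2` and `compare` are assumptions made here for illustration only.

```python
# Counts copied from the Results: (breakpoint hits, control hits),
# each out of 504 sequences.
TOTAL = 504
COUNTS = {
    "Alu SINEs": (314, 338),
    "MIR SINEs": (191, 207),
    "L1 LINEs": (328, 344),
    "L2 LINEs": (155, 171),
    "L3 LINEs": (35, 37),
    "LTR": (266, 253),
}

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
# The 5% level is an assumption; the abstract does not state a threshold.
CRITICAL_5PCT_1DF = 3.841


def chi_square_2x2(a, b, n1, n2):
    """Pearson chi-square statistic (no continuity correction)
    comparing the proportions a/n1 and b/n2."""
    c, d = n1 - a, n2 - b          # sequences without the element
    n = n1 + n2
    den = (a + b) * (c + d) * n1 * n2
    if den == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / den


def compare(counts=COUNTS, total=TOTAL):
    """Return (name, breakpoint %, control %, chi2, significant?) per element."""
    rows = []
    for name, (bp, ctrl) in counts.items():
        stat = chi_square_2x2(bp, ctrl, total, total)
        rows.append((name, 100 * bp / total, 100 * ctrl / total,
                     stat, stat > CRITICAL_5PCT_1DF))
    return rows


if __name__ == "__main__":
    for name, bp_pct, ctrl_pct, stat, sig in compare():
        print(f"{name:10s} breakpoint {bp_pct:5.2f}%  control {ctrl_pct:5.2f}%  "
              f"chi2 = {stat:.2f}  significant at 5%: {sig}")
```

Under these assumptions, none of the six element classes listed differs significantly between breakpoint regions and controls. Microsatellite counts are not given in the abstract, so they are not included in the sketch.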
Keywords/Search Tags:CNV, Database, WAMP, LCRs/SDs, RepeatMasker, Repeat elements