| Whole-genome sequencing in humans and model organisms has shown that non-coding DNA makes up the majority of most eukaryotic genomes. Transcription factors recognize degenerate families of short sequences to regulate gene expression during development and throughout life. The serum response factor (SRF) is one of the well-studied regulatory factors. It is expressed throughout the body and throughout development, and it mainly regulates genes that affect the development of the organism. Its important role in gene expression makes SRF a valuable transcription factor to study. While many studies of the CArG cis-elements bound by SRF are in progress, little is known about the sequence characteristics and evolutionary origin of functional CArG elements. To date, a total of 390 CArG elements have been experimentally validated in mammals. First, we used a validated CArG dataset to calculate the distance distribution of functional CArG elements around the transcription start site (TSS). Distances between adjacent CArGs were also analyzed. We compared these distributions with those derived from a control set of randomly selected CArGs that were not experimentally validated for function. Second, the CArG sequences were compared with the background sequence to identify the sequence characteristics of functional CArG elements. Third, to find the dominant haplotypes in mammals, a computer program was run to scan all functional CArG elements in the mouse and human genomes and enumerate their haplotypes. Finally, to investigate the evolutionary origin of the CArG elements, the functional CArG elements were compared between mouse and human through orthologous analysis. Our results show that 71% of functional CArG elements lie upstream of the annotated TSS, with copy number increasing as one moves closer to the TSS. 
The distribution of functional CArG elements upstream of the annotated TSS follows a negative power function. Moreover, the average number of CArG-like elements in CArG-containing genes is significantly higher than in control genes. However, when the copy numbers of the two sets are nearly equal, the distance between adjacent elements shows no bias between functional and control elements. By comparing the CArG sequences with the background sequence, we found that the substitution rate within functional CArG elements is lower than that of the background DNA in both genomes, which indicates that functional CArG elements evolve more slowly than background DNA because of functional constraint. However, the core sites of the functional CArG elements evolved faster than the background DNA, which may hint that these two sites are under positive selection. The core region of functional CArG elements also showed an obvious bias toward a TATA sequence, and this sequence characteristic may contribute to the binding of SRF to CArG elements. The haplotype analysis showed that the sequences of CArG elements are significantly diverse within species in both genomes. Moreover, the dominant haplotypes of the CArG elements are not entirely the same in the two genomes. To verify this finding, we performed an orthologous analysis of the promoter regions of CArG-containing genes, which showed that most functionally important CArG elements are perfectly conserved between the two genomes. This analysis identified 22 orthologous pairs between human and mouse. This result indicates that functionally important CArG elements are conserved between relatively distant species. We have performed the first genome-scale analysis of the CArGome in mammals from a genetic and evolutionary perspective. The studies presented here reveal the sequence characteristics of CArG elements and extend earlier bioinformatic analyses of functional CArG elements. 
In this study, we reveal important patterns of sequence characteristics within functional CArG elements; taking these into account will be a great help in predicting candidate CArG elements and in distinguishing functional CArG elements through alignments. The results presented here provide a platform for studying cis-elements in mammals from a genetic and evolutionary perspective. A better understanding of the sequence characteristics of different classes of cis-elements will fundamentally affect future developments in cis-element prediction, analysis of gene expression, and detection of regulatory determinant patterns. |
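The scanning and TSS-distance steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' program. It assumes the commonly cited CArG consensus CC[A/T]6GG (the abstract does not state which pattern was used), and the function names, the toy sequence, and the TSS index are hypothetical choices made only for illustration.

```python
import re

# Assumption: the commonly cited CArG consensus CC[A/T]6GG. The abstract does
# not say which pattern (or how many mismatches) the authors allowed.
# The lookahead lets overlapping matches be reported as well.
CARG_PATTERN = re.compile(r"(?=(CC[AT]{6}GG))")


def find_carg_boxes(sequence):
    """Return (start, motif) for each consensus CArG box, 0-based start.

    The consensus is its own reverse complement as a pattern class,
    so scanning one strand is enough for this sketch.
    """
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in CARG_PATTERN.finditer(seq)]


def distances_to_tss(hits, tss_index):
    """Signed offset of each motif start from the TSS (negative = upstream)."""
    return [start - tss_index for start, _ in hits]


def fraction_upstream(distances):
    """Share of elements located upstream of the TSS (0.0 for no elements)."""
    if not distances:
        return 0.0
    return sum(d < 0 for d in distances) / len(distances)


if __name__ == "__main__":
    # Hypothetical toy promoter; the TSS position is chosen arbitrarily.
    toy_sequence = "GGATCCTTATATGGACGT" + "A" * 10 + "CCAAATTTGGTT"
    hits = find_carg_boxes(toy_sequence)
    offsets = distances_to_tss(hits, tss_index=25)
    print(hits, offsets, fraction_upstream(offsets))
```

Applied to every gene in a validated set and in a control set, the signed offsets could be binned to compare the two distance distributions, in the spirit of the comparison the abstract describes.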