| With the rapid development of sequencing technology, researchers have carried out many haplotype-related studies, including haplotype assembly, allele-specific gene expression, allele-specific long-range chromatin interaction, and the parental transmission effects of natural genetic variants. These studies have revealed characteristics of haplotypes with respect to tissue specificity, individual differences, and population preferences. Haplotype construction therefore helps in exploring the potential regulatory mechanisms of many diseases and cancers, and allows more researchers to understand and explore the functions of haplotypes. In recent years, researchers have relied on long-read sequencing data to construct haplotype blocks. However, haplotype block construction still mainly depends on whole-genome sequencing (WGS) to detect heterozygous SNP sites, uses these heterozygous SNPs to phase the blocks, and then uses long-range chromatin interactions to link the blocks, extending the haplotype length and completing the construction. As a stable epigenetic modification, DNA methylation plays a key role in regulating allele-specifically expressed genes. For many imprinted genes, allele-specific expression between the parental haplotypes has been reported to be strongly related to allele-specific DNA methylation. These allele-specific DNA methylation sites can therefore also serve as a key factor in assisting the construction of haplotype blocks. This project constructs haplotype blocks from WGS, WGBS, and Hi-C datasets of ten different cell lines, such as GM12878, A549, and K562, by combining heterozygous SNPs and long-range chromatin interactions with allele-specific DNA methylation sites. After the haplotype blocks are constructed, the haplotype block information is used to distinguish haplotypes in RNA-seq, ChIP-seq, and Hi-C data. The
association analysis of allele-specifically expressed genes, allele-specific transcription factor binding, allele-specific DNA methylation, and allele-specific long-range chromatin interaction was performed to explore gene regulatory mechanisms. The results of this study indicate the following. 1) Compared with haplotypes constructed from heterozygous SNPs alone, the haplotype blocks constructed by combining heterozygous SNPs with allele-specific DNA methylation sites have a longer average length in each cell line, and both the utilization rate of heterozygous SNPs and the genome-wide coverage of the haplotype blocks increase significantly. 2) After long-range chromatin interactions are incorporated, the utilization rate of heterozygous SNPs exceeds 95% in most cell lines. When only haplotype blocks with more than 100 heterozygous SNPs are retained, each cell line keeps only one or two long haplotype blocks per chromosome. However, the utilization rate of heterozygous SNPs and the genome-wide coverage of the haplotype blocks decrease only slightly compared with the values before filtering, and the utilization rate of heterozygous SNPs remains above 90% in most cell lines. This indicates that haplotype block construction performed well for most cell lines in this study. 3) The haplotype blocks constructed in this study were compared with the high-quality, widely used, publicly available haplotype blocks of the GM12878, K562, and HepG2 cell lines and, for GM12878, with haplotype blocks based on PacBio and Nanopore long-read data; the consistency in all comparisons is above 97%. This shows that the haplotype blocks constructed in this study are highly accurate. 4) After constructing the haplotype blocks, we carried out the allele-specific association analysis and found a significant correlation among genes with allele-specific DNA methylation, allele-specifically expressed genes, and
imprinted genes. In addition, by combining the analysis of allele-specific CTCF transcription factor binding and allele-specific long-range chromatin interaction, we found significant allele-specific long-range chromatin interaction and allele-specific DNA methylation involving the H19/IGF2 imprinted genes, the MEST imprinted gene, and the BPRL28 gene between the haplotypes of the GM12878 cell line. The results for the H19/IGF2 genes were highly consistent with the previously reported regulatory mechanism. The MEST imprinted gene and the BPRL28 gene were newly found to participate in allele-specific chromatin interaction and to carry allele-specific DNA methylation. However, the results for MEST and BPRL28 still need further validation, and their regulatory mechanisms require further exploration. In conclusion, allele-specific DNA methylation sites can significantly improve haplotype block construction. Moreover, once the haplotype blocks are constructed, the association analysis of allele-specifically expressed genes, allele-specific DNA methylation, allele-specific CTCF transcription factor binding, and allele-specific long-range chromatin interaction provides a reference for exploring the potential regulatory mechanisms of many disease-related imprinted genes, and offers new ideas for promoting the research and diagnosis of diseases caused by the imbalance of imprinted genes. |
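
The summary metrics quoted in the abstract (utilization rate of heterozygous SNPs, genome-wide coverage, retention of blocks with more than 100 heterozygous SNPs, and consistency with a reference phasing) can be illustrated with a minimal Python sketch. The abstract does not give formal definitions, so the formulas below are assumptions chosen for illustration and may differ from those used in the study; the `Block` class and all function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Block:
    """One haplotype block (hypothetical structure, for illustration only)."""
    chrom: str
    # position -> allele (0 or 1) assigned to the first haplotype of the block
    phased: Dict[int, int]

    @property
    def span(self) -> int:
        """Number of bases between the first and last phased SNP, inclusive."""
        if not self.phased:
            return 0
        return max(self.phased) - min(self.phased) + 1


def snp_utilization(blocks: List[Block], total_het_snps: int) -> float:
    """Assumed definition: phased heterozygous SNPs / all detected ones."""
    return sum(len(b.phased) for b in blocks) / total_het_snps


def genome_coverage(blocks: List[Block], genome_length: int) -> float:
    """Assumed definition: summed block spans / genome length."""
    return sum(b.span for b in blocks) / genome_length


def filter_blocks(blocks: List[Block], min_snps: int = 100) -> List[Block]:
    """Keep blocks containing more than `min_snps` heterozygous SNPs."""
    return [b for b in blocks if len(b.phased) > min_snps]


def consistency(block: Block, reference: Dict[int, int]) -> float:
    """Assumed definition: fraction of shared SNPs phased the same way.

    Haplotype labels are arbitrary, so the better of the two possible
    orientations of the block against the reference is reported.
    """
    shared = [pos for pos in block.phased if pos in reference]
    if not shared:
        return 0.0
    same = sum(block.phased[pos] == reference[pos] for pos in shared)
    return max(same, len(shared) - same) / len(shared)
```

With these assumed definitions, comparing the utilization rate and coverage before and after `filter_blocks` corresponds to the before/after-filtering comparison described in result 2), and `consistency` corresponds to the agreement with published or long-read-based haplotype blocks described in result 3).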