Poplar Three-dimensional Genome Comparative Study And Analysis Platform Construction Based On Data Mining

Posted on: 2022-04-13
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Yang
GTID: 2493306551470664
Subject: Master of Engineering
Abstract/Summary:
With the development of sequencing technology, the amount of biological macromolecule sequence data has grown rapidly. Data mining, a technology for extracting unknown, implicit and potentially valuable information from very large data sets, is widely used in bioinformatics to explore the biological significance of such data. Within this field, the three-dimensional genome has become a research hotspot in genomics in recent years. Studies have shown that the three-dimensional structure of the genome is related to gene transcription regulation and epigenetics. However, comparative analysis of three-dimensional genome structure between species has not been widely studied in plants. Therefore, taking poplar as an example, using data mining to conduct a three-dimensional genome comparative analysis between Populus euphratica and P. alba var. pyramidalis is of great significance for exploring the conservation of genome spatial structure and its influence on transcriptional regulation during plant evolution.

Based on an investigation of the current state of research at home and abroad, the following three scientific questions are proposed:
(1) How can a classification algorithm for the A/B compartments of chromatin spatial structure be implemented on the basis of methylation data?
(2) Taking poplar as an example, how can a general analysis process be constructed for the comparative study of plant three-dimensional genomes?
(3) How can an online platform for the comparative analysis of plant three-dimensional genomes be built?

To address these questions, data mining technology is used to carry out the research. The work is summarized as follows:
(1) Based on the A/B compartment classification algorithm pca.hic, combined with a correlation algorithm, an A/B compartment classification algorithm based on methylation data is implemented, namely MDPH (Methylation data based pca.hic). MDPH directly uses methylation data, which can dynamically reflect the state of chromatin, to classify A/B compartments, thereby providing more accurate A/B compartment results than the pca.hic algorithm, which is based on static gene data.
(2) Using computational methods such as three-dimensional genome structure identification software, the MDPH algorithm, an interval overlap algorithm, and data mining, a three-dimensional genome comparative analysis between Populus euphratica and P. alba var. pyramidalis was carried out, and a general process for the comparative analysis of the three-dimensional genomes of closely related plants was proposed. This method quantifies the effect of changes in the three-dimensional structure of chromatin on epigenetics during plant evolution. It offers a reference for researchers and can accelerate research on the comparative analysis of three-dimensional genomes between plants.
(3) Based on Web development technology and the Django framework, an online data calculation platform for the comparative analysis of plant three-dimensional genomes, namely TPGCA (Three dimensional Plant Genome Comparative Analysis Platform), has been established. The TPGCA platform provides functions such as identification, visualization, and comparative analysis of three-dimensional genome structure, and offers convenient and simple data processing services for researchers studying three-dimensional genome comparative analysis.
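The abstract does not give the internals of pca.hic or MDPH, so the following is only a minimal sketch of the general idea that PCA-based A/B compartment calling is commonly built on: correlate the per-bin profiles, take the first principal component, and label bins by its sign. It is not the thesis's implementation. The function names `call_compartments` and `overlap_length`, the `orientation_track` argument, and the toy matrix are all hypothetical and introduced here for illustration. The second helper shows the basic quantity an interval overlap comparison rests on, assuming half-open `(start, end)` coordinates.

```python
import numpy as np


def call_compartments(contact_matrix, orientation_track):
    """Label each genomic bin 'A' or 'B' from the sign of the first principal component.

    contact_matrix    : square (n_bins x n_bins) array of contact-like values.
    orientation_track : length-n_bins array (hypothetical per-bin signal) used
                        only to fix the arbitrary sign of PC1.
    """
    m = np.asarray(contact_matrix, dtype=float)
    # Pearson correlation between the profiles (rows) of every pair of bins;
    # NaNs from constant rows are replaced by 0.
    corr = np.nan_to_num(np.corrcoef(m))
    # eigh returns eigenvalues in ascending order, so the last column is PC1.
    _, eigvecs = np.linalg.eigh(corr)
    pc1 = eigvecs[:, -1]
    # An eigenvector's sign is arbitrary: flip it so PC1 follows the track.
    track = np.asarray(orientation_track, dtype=float)
    if np.corrcoef(pc1, track)[0, 1] < 0:
        pc1 = -pc1
    return ["A" if v > 0 else "B" for v in pc1]


def overlap_length(a, b):
    """Length of the overlap between two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


if __name__ == "__main__":
    # Toy matrix (assumed data): bins 0-2 contact each other strongly,
    # as do bins 3-5, with weak contact between the two groups.
    toy = np.ones((6, 6))
    toy[:3, :3] = 10.0
    toy[3:, 3:] = 10.0
    print(call_compartments(toy, [5, 5, 5, 1, 1, 1]))
    print(overlap_length((100, 200), (150, 400)))
```

In this sketch the orientation track decides only which side of PC1 is called A; which signal plays that role in a real analysis is a choice the abstract does not specify.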
Keywords/Search Tags: data mining, three-dimensional genome, three-dimensional genome comparative analysis, online analysis platform