Combining Genomics And Transcriptomics To Investigate Zebra Evolution Model

Posted on: 2019-09-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: R L Sa
GTID: 1360330566490874
Subject: Animal genetics, breeding and reproduction
Abstract/Summary:
Zebras are members of the genus Equus. They are mainly distributed across the African grasslands and comprise three species: the plains zebra, Grevy's zebra and the mountain zebra. The plains zebra is abundant, whereas the other two have been placed on the endangered species list. Equus genomes and evolution have been studied extensively, but only a few studies address the zebra genome, and these are limited to mitochondrial genomes and a small amount of genome re-sequencing data. Large-scale sequencing technology is now widely used to sequence and analyze mammalian genomes because of its low cost and high throughput. Within Equus, the horse and donkey genomes have been sequenced and assembled to the chromosomal and sub-chromosomal level, which makes it possible to compare the two genomes and trace their history. Although some genomic sequence data exist for the plains zebra, these read-level sequences are insufficient for analyzing genome-wide genetic variation or for phylogenetic studies. Therefore, to enable phylogenetic analysis and a comprehensive evolutionary study of Equus, we performed de novo sequencing of the plains zebra genome. In addition, existing biological databases contain almost no transcript-level data for zebra. We therefore performed transcriptome sequencing on different tissues of the plains zebra and explored the genetic mechanism of zebra adaptive evolution through the evolutionary rates of orthologous genes. The main conclusions from the genome and transcriptome sequencing are as follows:

(1) We obtained 570733246 clean reads for the zebra genome using the Illumina HiSeq and MiSeq platforms. We then identified 26032374 SNPs and 687552 InDels in the zebra genome relative to the horse reference (Equ2.0); these were mainly distributed in intergenic regions. Assembly with Newbler yielded a total sequence length of 2.36 Gb, with contig and scaffold N50 values of 43.7 kb and 1.45 Mb, respectively. The repeat content of the zebra genome is 42.61%, and the number of protein-coding genes is 22732.

(2) Using homologous gene trees, we inferred a divergence time of 9.2-25.5 Mya for the caballine and non-caballine lineages, and 6.7-22.6 Mya for zebra and donkey. The zebra genome has 872 expanded gene families and 1750 rapidly evolving genes, which were mainly enriched in zinc finger, transcription, regulation of transcription, sphingoid metabolic process, protein complex scaffold, sequence-specific DNA binding, cellular response to extracellular stimulus, regulation of T-helper 2 type immune response, regulation of transcription from RNA polymerase II promoter and transcription factor activity.

(3) A relationship tree built from the Equidae SNPs available in public databases showed that the plains zebra clustered with quagga and boehmi and, together with Grevy's zebra and the mountain zebra, formed a monophyletic zebra branch. The PSMC curves showed a population trajectory for zebra different from those of horse and donkey, suggesting asynchronous and complex ecological dynamics in America, Eurasia and Africa.

(4) Comparing the plains zebra draft genome with the horse genome, we detected 2207 rearrangement events, including 1664 insertions of unknown origin, 2 inserted duplications, 204 inversions and 337 relocations. These rearranged regions are rich in LINE/L1, Satellite, LTR/ERV1 and SINE/tRNA repeat types. We also found that the SAT2 pl and SATEC satellites are centromeric sequences of plains zebra chromosome 2, consistent with the centromeric sequences of the horse.

(5) RNA-seq on the Illumina HiSeq X Ten yielded 67795342-89462708 high-quality reads across 5 different zebra tissues. Assembly with Trinity produced 752562 transcripts and 482219 unigenes, with average lengths of 1127 bp and 631 bp and N50 values of 2719 bp and 954 bp, respectively. After annotation, 243758 unigenes (50.54%) were annotated in at least one database. We detected 69096 SSRs in plains zebra unigenes.

(6) Using FPKM > 0.3 as the gene expression threshold, we identified 78181-216298 expressed genes across the 5 tissues, including 23672 commonly expressed genes; expression levels were most similar between skeletal muscle and heart. Tissue-specific genes were most numerous in the lung and kidney.

(7) Through DEG analysis, the most differentially expressed genes were found in kidney and skeletal muscle, followed by kidney and skeletal muscle. After annotation, TNNT2 and TNNI3 were highly expressed in the heart; ALB, CYP2D and UGT were highly expressed in the liver; TNNC2, TNNI2 and TNNI1 were highly expressed in skeletal muscle; UMOD was highly expressed in the kidney; and HSPA18 and SFTPC were highly expressed in the lung.

(8) Based on the substitution rates of orthologous genes, we detected 877 positively selected genes in the zebra genome, of which 284 were significantly positively selected (P<0.05) and 270 were highly significantly positively selected (P<0.01). GO and KEGG enrichment analysis showed that these positively selected genes are mainly involved in immunity, the nervous system, angiogenesis, ultraviolet protection and insulin secretion, which could contribute to adaptation to a tropical climate.

(9) Small RNA sequencing of different zebra tissues yielded 14061889-16662677 high-quality reads, mainly in the 21-23 nt length range. Comparison with known horse sequences in miRBase identified 204 conserved miRNAs and 274 miRNA precursors. In addition, 78 mature miRNAs and 83 miRNA precursors were predicted. Both known and novel miRNAs show a preference for U at the first base.

(10) Using TPM ≥ 0.1 as the miRNA expression threshold, miRNAs of moderate and high abundance account for a higher proportion in zebra tissues. There were 127 differentially expressed miRNAs among three zebra tissues, including 85 between heart and liver, 25 between heart and skeletal muscle, and 86 between liver and skeletal muscle. A total of 34205 target genes were predicted for 282 miRNAs. The target genes of the differentially expressed miRNAs were mainly concentrated in GO categories including molecular function, protein binding, cell component, cell process and metabolic process, and in KEGG pathways including the Ras signaling pathway, Jak-STAT signaling pathway, Neurotrophin signaling pathway and Insulin resistance.

The results of this study provide sequence data resources for molecular biology studies of Equus and lay a foundation for further research.
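For readers unfamiliar with two of the metrics quoted in the abstract, the following is a minimal Python sketch of how an assembly N50 and an FPKM-based expression filter are conventionally computed. It is not the dissertation's pipeline: the function names (`n50`, `fpkm`, `expressed_genes`) and the toy inputs are hypothetical. The only value taken from the abstract is the FPKM > 0.3 cutoff.

```python
def n50(lengths):
    """Return the N50 of a set of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def expressed_genes(fpkm_by_gene, threshold=0.3):
    """Keep genes whose FPKM exceeds the threshold (0.3 as in the abstract)."""
    return {gene for gene, value in fpkm_by_gene.items() if value > threshold}


if __name__ == "__main__":
    # Toy data for illustration only; these are not values from the study.
    toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print("N50:", n50(toy_contigs))

    toy_fpkm = {
        "geneA": fpkm(300, 1500, 20_000_000),
        "geneB": fpkm(1, 4000, 20_000_000),
    }
    print("Expressed:", sorted(expressed_genes(toy_fpkm)))
```

The same `n50` function applies equally to contigs, scaffolds or assembled transcripts; only the list of lengths changes.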
Keywords: Plains zebra, Whole genome, Genome evolution, Transcriptome, Positively selected genes