
Research On The Complexity Of Virus-encoded Transcription Products Based On Next-generation Sequencing Data

Posted on: 2023-03-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z N Cai
GTID: 1520307334472924
Subject: Biology

Abstract/Summary:
The genomes of viruses are much smaller than those of cellular organisms, yet they encode transcription products efficiently. Besides mRNAs, viruses can encode other types of transcription products, including lncRNAs, circRNAs, miRNAs and siRNAs. However, most studies investigate only a specific type of virus or transcription product, and a systematic study of virus-encoded transcription products is still lacking. This thesis investigated the complexity of virus-encoded transcription products by analyzing virus-infection-related next-generation sequencing (NGS) data. The main findings are as follows:

1) Systematic identification and characterization of viral circRNAs. We performed a systematic survey of 11,924 circRNAs from 23 viral species by computationally predicting viral circRNAs from viral-infection-related RNA sequencing data. Besides the dsDNA viruses, the study also revealed many circRNAs in single-stranded RNA viruses and retro-transcribing viruses. Most viral circRNAs had reverse complementary sequences or repeated sequences flanking the back-splice sites, which suggests that both short flanking reverse complementary sequences and repeated sequences might be crucial for circRNA biogenesis in viruses. Most viral circRNAs were expressed only in a specific cell line or tissue in a specific species. The circRNAs from dsDNA viruses were heavily involved in KEGG pathways associated with cancer, which suggests that they may play a significant role in cancer. All viral circRNAs presented in the study are stored and organized in VirusCircBase, which is freely available at http://www.computationalbiology.cn/ViruscircBase/home.html.

2) Identification and characterization of circRNAs encoded by MERS-CoV, SARS-CoV-1 and SARS-CoV-2. Mining of viral-infection-related RNA sequencing data identified 28,754, 720 and 3,437 circRNAs encoded by MERS-CoV, SARS-CoV-1 and SARS-CoV-2, respectively. The majority of these viral circRNAs were expressed only in the late stage of viral infection. In the late stage of MERS-CoV infection, the viral circRNAs regulated genes involved in diverse functions, including cancer, metabolism, autophagy and viral infection. In the late stage of SARS-CoV-2 infection, the viral circRNAs regulated genes involved in the metabolic processes of cholesterol, alcohol and fatty acids, and in cellular responses to oxidative stress.

3) Development of a novel method, vsRNAfinder, for identifying high-confidence virus-encoded small RNAs (vsRNAs) from small RNA-Seq (sRNA-Seq) data. vsRNAfinder outperformed two widely used methods, miRDeep2 and ShortStack, in identifying viral miRNAs, siRNAs and piRNAs. It can also be used to identify sRNAs in animals and plants. Using vsRNAfinder, a total of 19,734 high-confidence vsRNAs, including 2,746 miRNAs, were identified in 64 viral species. The ability to express vsRNAs varied across viruses and hosts. The vsRNAs showed strong expression specificity, dynamic expression and little conservation, consistent with the viral circRNAs.

4) Integration and analysis of multiple types of viral RNA products. We obtained the most comprehensive data set of virus-encoded RNAs for 110 viruses, including 3,385 mRNAs, 469 lncRNAs, 52,214 circRNAs and 20,706 sRNAs, based on multiple kinds of NGS data and experimentally validated viral RNA products. The ability of a virus to express RNAs varied with RNA type: some viruses with large genomes expressed a large number of mRNAs but only a small number of circRNAs and sRNAs. In some viruses, RNAs showed a clear preference for particular locations along the genome, and the genomic locations of expressed circRNAs and sRNAs were significantly correlated. Analysis of RNA interactions showed that the preferred sites of interaction between viral miRNAs and viral RNAs differed by RNA type. When the influence of alternative splicing on RNA interactions was explored, most of the alternative splicing events detected in viruses were found to change the interaction sites between viral mRNAs or lncRNAs and viral miRNAs, which might help viruses evade regulation by viral miRNAs.

In summary, this thesis systematically described the characteristics and potential functions of different types of virus-encoded transcription products, and developed a tool, vsRNAfinder, for identifying virus-encoded sRNAs. The study not only deepens our understanding of the complexity of viral transcription products, but also provides data resources and tools for further studies of virus-encoded transcription products.
Keywords/Search Tags: Virus, Bioinformatics, High-throughput sequencing, Transcriptome, Circular RNA, Small RNA, Long non-coding RNA