Lung Cancer CircRNA Bioinformatics Analysis And Database Development

Posted on: 2019-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Xia
Full Text: PDF
GTID: 2370330545996379
Subject: Bioinformatics
Abstract/Summary:
circRNA is a special closed-loop RNA molecule whose expression level is relatively low. It was long regarded as "noise" and was not taken seriously. Only in recent years, with the rapid development of sequencing technology, has circRNA been discovered on a large scale; because of its special loop structure and complex biological functions, it has become a focus of current transcriptomic research. However, research on the structure of circRNA, its molecular-level properties, and its loop formation mechanism, as well as the analysis of its biological function, are all still at a very preliminary stage. Lung cancer, as the cancer with the highest morbidity and mortality rate in the world, has always been a problem in the medical field. In recent years, results from circRNA research in various cancers have also brought new breakthroughs in lung cancer research.

In this paper, 1,029 RNA-seq datasets from lung cancer patients and control groups were used to identify and screen a large number of lung cancer-related circRNAs. Bioinformatics analysis of their structure and function was conducted to find structural features, improve the relevant annotation information, and construct a circRNA database for lung cancer. We selected good-quality RNA-seq data of lung cancer patients and control groups from the NCBI, and used scripts to integrate three circRNA identification tools (find_circ, CIRCexplorer2 and CIRI) for the identification of circRNAs. In the end, 19,397 circRNAs with higher credibility were obtained. Among them, MAN1A2, ZC3H6, SLTM, RSRC1, and RNF168 are the most frequent host genes. GO and KEGG enrichment analysis of these host genes showed that they are mainly enriched in the cell cycle, transcriptional regulation, and various cancers, including non-small cell lung cancer. A circRNA-miRNA interaction network analysis performed on circRNAs with high expression levels found that most of these circRNAs interact with cancer-associated miRNAs. This also shows that the circRNAs we obtained are highly correlated with cancer.

In addition, by analyzing the characteristics of circRNA, we found that chromosome length and the density of Alu elements in chromosomal introns can influence the production of circRNA to a certain extent; in most cases, a single host gene produces only 1-3 circRNAs. Considering that the correspondence between host gene and circRNA may be related to certain specific positions in the host gene, this paper makes a statistical analysis of the "hot spots" of circRNA (i.e., sites shared by multiple circRNAs). We found that "hot spots" are widely present in circRNA: about 43% of the circRNAs we obtained are associated with "hot spots". We also unexpectedly discovered that the number of "hot spots" at the start site of circRNA is far greater than the number at the end site (about 6 times). This phenomenon may be related to some as yet undiscovered circRNA loop formation mechanism.

Finally, we uniformly classified and integrated the basic information (including chromosome locations, lengths, annotations, etc.) of the circRNAs associated with lung cancer, and developed a database and website that provide functions such as searching, browsing and downloading of lung cancer-related circRNAs.
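The tool-integration and "hot spot" steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis's actual script. It assumes each circRNA is represented by its back-splice junction as a `(chrom, start, end, strand)` tuple, that coordinates from the three tools have already been normalized to a common convention, and that "higher credibility" means support from at least two tools. The abstract does not state the threshold, so `min_tools` and all function names here are hypothetical.

```python
from collections import Counter

# Illustrative sketch only. A circRNA is a tuple (chrom, start, end, strand),
# with start/end taken as genomic coordinates (an assumption; the abstract
# does not say how start and end sites were defined).


def consensus_calls(calls_by_tool, min_tools=2):
    """Keep junctions reported by at least `min_tools` tools.

    `min_tools=2` is an assumed threshold, not one given in the abstract.
    """
    support = Counter()
    for calls in calls_by_tool.values():
        support.update(set(calls))  # each tool votes once per junction
    return {junction for junction, n in support.items() if n >= min_tools}


def hot_spots(circrnas):
    """Return (hot_starts, hot_ends): sites shared by more than one circRNA."""
    start_counts = Counter((c, s, strand) for c, s, _e, strand in circrnas)
    end_counts = Counter((c, e, strand) for c, _s, e, strand in circrnas)
    hot_starts = {site for site, n in start_counts.items() if n > 1}
    hot_ends = {site for site, n in end_counts.items() if n > 1}
    return hot_starts, hot_ends


def hot_spot_fraction(circrnas):
    """Fraction of circRNAs whose start or end site is a hot spot."""
    circrnas = list(circrnas)
    if not circrnas:
        return 0.0
    hot_starts, hot_ends = hot_spots(circrnas)
    n_hot = sum(
        1
        for c, s, e, strand in circrnas
        if (c, s, strand) in hot_starts or (c, e, strand) in hot_ends
    )
    return n_hot / len(circrnas)


# Toy input: made-up junctions, keyed by the tool that reported them.
toy_calls = {
    "find_circ": [("chr1", 100, 500, "+"), ("chr1", 100, 900, "+")],
    "CIRCexplorer2": [("chr1", 100, 500, "+"), ("chr2", 50, 300, "-")],
    "CIRI": [("chr1", 100, 900, "+"), ("chr2", 50, 300, "-"), ("chr3", 10, 20, "+")],
}
credible = consensus_calls(toy_calls)
starts, ends = hot_spots(credible)
```

With this representation, comparing `len(starts)` with `len(ends)` gives the kind of start-site versus end-site comparison the abstract reports. The toy junctions above are invented and carry no biological meaning.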
Keywords/Search Tags: circRNA, lung cancer, database, RNA-seq, loop formation mechanism