
Identification Of High-Grade Serous Ovarian Cancer Targeting Hub Genes Based On High-dimensional Co-expression Network Analysis

Posted on: 2024-08-26
Degree: Master
Type: Thesis
Country: China
Candidate: X L Li
GTID: 2544306923456414
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Background: Ovarian cancer is a serious gynecologic malignancy that poses a significant threat to women's health. Globally, it is the eighth most common cancer among women, with more than 313,000 new cases worldwide in 2020. In recent years, the global incidence of ovarian cancer has gradually increased, mainly owing to risk factors such as lifestyle changes, environmental pollution, genetic factors, and delayed marriage and childbirth. The incidence in China has risen year by year over the past few decades, and the number of young women with ovarian cancer is also increasing. It is therefore urgent to identify the oncogenic genes involved in the development of ovarian cancer. Traditional wet-lab methods are time-consuming and expensive, and the high inter-individual heterogeneity of ovarian cancer makes the required experimental scope broader than for other cancers. Differential gene screening based on wet-lab experiments alone cannot meet the needs of clinical research. In recent years, the rapid development of single-cell transcriptome technology has made it possible to study disease heterogeneity at single-cell resolution, and the corresponding bioinformatics algorithms for single-cell transcriptome data have gradually matured. However, traditional bioinformatics methods based on gene-disease association analysis cannot exclude confounding, which results in a high false-positive rate among the genes identified. Moreover, few genes act alone; almost all genes carry out their functions within biological networks, so studying the correlation of individual genes in isolation may lose key information. High-dimensional weighted gene co-expression network analysis (hdWGCNA) can reveal common network regulatory patterns and functional pathways among genes and identify disease-associated hub genes while accounting for the high-dimensional sparsity of single-cell data. Targeted Maximum Likelihood Estimation (TMLE) is a doubly robust causal analysis method based on maximum likelihood estimation, and it is more robust than non-substitution estimators to outliers and sparsity in the data. TMLE can select key causal genes from disease-associated hub genes while taking into account the characteristics of single-cell sequencing data.

Objective: This study exploits the unique advantage of single-cell RNA sequencing data in revealing disease cell heterogeneity. Using the high-dimensional co-expression network method, key hub genes of ovarian cancer at different clinical stages are identified, and critical causal genes are then selected with TMLE. The findings provide a reference for experimental validation of key differential genes at different stages of ovarian cancer, for exploring the pathogenesis of ovarian cancer, and for developing targeted therapeutic drugs.

Methods: Cancer tissue samples were obtained from three ovarian cancer patients and subjected to single-cell RNA sequencing on the 10x Genomics platform. In addition, single-cell sequencing data from 12 Chinese ovarian cancer patients were collected from the GEO database (GSE184880), giving a total of 94,365 cells and 36,601 expressed genes. Cells from cancer tissue were defined as the "case" group, and cells from non-malignant ovarian tissue were defined as the "control" group. A gene expression matrix with 94,365 rows and 36,601 columns was constructed, with cells as sample units and genes as variables. Standard bioinformatics analysis was first performed on this matrix: the single-cell RNA sequencing data were integrated and pre-processed to remove batch effects, followed by cell clustering and annotation. Enrichment analysis was used to identify key gene modules involved in ovarian cancer. Within cell populations, the hdWGCNA method was used to identify hub gene modules that differ between cells derived from cancer tissue and cells derived from non-malignant ovarian tissue. The TMLE model was then used to screen further for causal hub genes involved in ovarian cancer development. For the potential causal hub genes, enrichment analysis and database analysis were used to identify signaling pathways that may contribute to ovarian cancer pathogenesis.

Results:
1. Fifteen single-cell data samples from cancer tissue were analyzed using bioinformatics methods, yielding 27 cell clusters. High-dimensional weighted gene co-expression network analysis of the epithelial cell cluster yielded 17 differential gene modules. Key signaling pathways, such as protein depalmitoylation and RNA methylation, were identified, and the top 30 hub genes were selected for each module.
2. The Targeted Maximum Likelihood Estimation model was used to screen for causal genes involved in ovarian cancer development. In the key co-expression module M1, 25 potential causal hub genes were identified for the transition from benign tumors to stage I high-grade serous ovarian cancer, 18 for the transition from stage I to II, and 24 for the transition from stage II to III. Fourteen potential causal genes were common to all three transitions, seven of which are established ovarian cancer marker genes. The gene with the greatest causal effect was CST3.
3. Enrichment analysis was performed on all potential causal hub genes in the 17 differential gene modules. For the transition from benign tumors to stage I, potential causal genes were enriched in pathways such as eukaryotic translation elongation, cellular response to stress, nervous system development, axon guidance, regulation of expression of SLITs and ROBOs, eukaryotic translation initiation, and cellular response to starvation. For the transition from stage I to II, potential causal genes were enriched in pathways such as cellular response to chemical stress, extracellular matrix organization, regulation of apoptotic signaling pathways, the TGF-β signaling pathway, cellular response to starvation, and cytokine signaling in the immune system. For the transition from stage II to III, potential causal genes were enriched in pathways such as SUMOylation targets, negative regulation of DNA-binding transcription factor activity, cancer pathways, nuclear receptor pathways, epithelial tube morphogenesis, and regulation of protein serine/threonine kinase activity.

Conclusion:
1. Conventional bioinformatics methods for gene expression-disease association analysis can only capture the correlation between a single gene and exposure, and they typically yield a large number of candidate genes. Given the heterogeneity of the disease, this is not conducive to subsequent experimental validation. Building on conventional analysis, this study used high-dimensional weighted co-expression networks to obtain key hub genes and then used TMLE to screen for potential causal hub genes at different stages. This effectively narrows the range of differentially expressed genes, improves experimental efficiency, reduces experimental costs, and provides methodological and algorithmic support for causal gene analysis in single-cell genomics.
2. By analyzing the potential causal hub genes shared across stages in the differential gene modules, this study identified two genes, ATD5D and TNFSF10, that may be important in the development of ovarian cancer. They provide a basis for further exploration of the mechanisms underlying ovarian cancer development and for the search for new drug targets.
3. Among the 25 potential causal hub genes in module M1 for the transition from benign tumors to stage I high-grade serous ovarian cancer, 10 have been reported in previous literature to be related to cancer, and these genes may drive tumor progression through eukaryotic translation elongation and other pathways. Among the potential causal genes for the transition from stage I to II, six have been confirmed in previous studies to be related to cancer and may cause further deterioration through regulation of apoptotic signaling pathways and other pathways. Among the potential causal genes for the transition from stage II to III, eight have been experimentally shown to be related to ovarian cancer and may drive tumor progression through oxidative phosphorylation and other pathways. This study provides a useful reference for exploring the mechanisms underlying the development of ovarian cancer and for precision diagnosis and treatment.
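The TMLE screening step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes, purely for illustration, a binary exposure `A` (for example, high versus low expression of one hub gene), a binary outcome `Y` (case versus control cell), and a covariate matrix `W`; these names, the simulated data, and the plain logistic models are all hypothetical. The sketch fits an initial outcome model, fits a propensity model, builds the clever covariate, estimates the fluctuation parameter, and averages the updated predictions to obtain the average treatment effect.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def logit(p):
    return np.log(p / (1.0 - p))

def fit_logistic(X, y, offset=None, n_iter=50):
    """Logistic regression by Newton-Raphson, with an optional fixed offset."""
    n, k = X.shape
    offset = np.zeros(n) if offset is None else offset
    beta = np.zeros(k)
    for _ in range(n_iter):
        p = expit(X @ beta + offset)
        grad = X.T @ (y - p)
        hess = (X * (p * (1 - p))[:, None]).T @ X + 1e-8 * np.eye(k)
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

def tmle_ate(W, A, Y, bound=0.01):
    """TMLE of the average treatment effect for binary A and binary Y."""
    n = len(Y)
    ones = np.ones((n, 1))
    # Step 1: initial outcome model Q(A, W) = P(Y = 1 | A, W).
    bQ = fit_logistic(np.hstack([ones, A[:, None], W]), Y)
    Q1 = np.clip(expit(np.hstack([ones, ones, W]) @ bQ), 1e-6, 1 - 1e-6)
    Q0 = np.clip(expit(np.hstack([ones, 0 * ones, W]) @ bQ), 1e-6, 1 - 1e-6)
    QA = np.where(A == 1, Q1, Q0)
    # Step 2: propensity model g(W) = P(A = 1 | W), bounded away from 0 and 1.
    Xg = np.hstack([ones, W])
    g = np.clip(expit(Xg @ fit_logistic(Xg, A)), bound, 1 - bound)
    # Step 3: clever covariate and fluctuation parameter epsilon.
    H = A / g - (1 - A) / (1 - g)
    eps = fit_logistic(H[:, None], Y, offset=logit(QA))[0]
    # Step 4: targeted update of both counterfactual predictions.
    Q1_star = expit(logit(Q1) + eps / g)
    Q0_star = expit(logit(Q0) - eps / (1 - g))
    return float(np.mean(Q1_star - Q0_star))

def simulate(n, seed=0):
    """Hypothetical data: one confounder W drives both A and Y."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n, 1))
    A = rng.binomial(1, expit(0.5 * W[:, 0])).astype(float)
    Y = rng.binomial(1, expit(-0.5 + 1.0 * A + 0.8 * W[:, 0])).astype(float)
    return W, A, Y

W, A, Y = simulate(5000)
print("TMLE estimate of the ATE:", tmle_ate(W, A, Y))
```

In a screening setting, a function like `tmle_ate` would be applied once per candidate hub gene and the genes ranked by estimated effect. How the exposure is defined and which covariates enter `W` are modeling choices that the abstract does not specify.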
Keywords/Search Tags: Causal inference, Targeted maximum likelihood estimation, Ovarian cancer, Single-cell RNA sequencing, Co-expression network