Association Between Genetic Variation of lncRNAs Based on WGCNA and GWAS and Tuberculosis Susceptibility

Posted on: 2024-06-09  Degree: Master  Type: Thesis
Country: China  Candidate: Z H Yu  Full Text: PDF
GTID: 2544307058962919  Subject: Epidemiology and Health Statistics
Abstract/Summary:
Objective: Tuberculosis (TB) is a chronic infectious disease that seriously endangers population health. Its pathogenesis depends not only on Mycobacterium tuberculosis infection but also on the immune status and genetic characteristics of the host. Previous studies have shown that long non-coding RNAs (lncRNAs) regulate gene expression through various mechanisms and thereby participate in the occurrence and development of disease, which has attracted much attention in research on pathogenesis. Recent studies have shown that genetic variation in lncRNAs, especially single nucleotide polymorphisms (SNPs), can alter genetic susceptibility by affecting lncRNA expression. In this study, weighted gene co-expression network analysis (WGCNA) was first used as a bioinformatic screen to construct a TB-related competing endogenous RNA (ceRNA) regulatory network. Information from multiple genome-wide association study (GWAS) databases was then compared to screen significant SNP loci in the core lncRNAs (hub lncRNAs) of the ceRNA network. Finally, a case-control study was used to analyze and verify the association between the screened lncRNA SNPs and tuberculosis susceptibility. The work aims to provide a research basis for exploring candidate genes closely related to tuberculosis pathogenesis, elucidating the mechanism of action of lncRNAs in tuberculosis, and carrying out individualized prevention and control of tuberculosis.

Methods: First, expression profile data from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) were analyzed. A transcriptome expression data set including healthy people and tuberculosis patients was selected, and R software was used to identify differentially expressed mRNAs (DEmRNAs) and lncRNAs (DElncRNAs). A WGCNA network was then constructed from the differential expression data, positively correlated modules highly related to disease susceptibility were identified, and Gene Ontology (GO) enrichment analysis was performed. Based on the positively correlated module and databases such as miRcode, miRDB, miRTarBase and TargetScan, a TB-related ceRNA regulatory network was constructed, and a preliminary association analysis between lncRNAs and mRNAs was performed. LASSO regression was then applied to simplify the model and screen out the core lncRNAs (hub lncRNAs) in the ceRNA network. To identify candidate SNP loci on these lncRNAs, multiple GWAS resources were cross-compared, mainly the LincSNP 3.0 database and whole genome sequencing (WGS) data from a TB population. The resulting TB candidate lncRNA SNPs were compared with general-population SNP information in the ChinaMAP database to determine the candidate SNP loci for subsequent experimental verification. In a hospital-based case-control study, inpatients with pulmonary tuberculosis and healthy controls recruited from January 1, 2018 to December 31, 2019 were selected as study subjects, and their basic information and venous blood samples were collected. DNA was extracted, and the target loci were amplified and genotyped by Sanger sequencing. Frequencies and proportions were used for statistical description, and the χ² test was used to compare the genotype distributions of the lncRNA SNP loci between the case group and the control group. Logistic regression analysis was used to calculate the OR and 95% CI for the association of each genotype with tuberculosis susceptibility.

Results: In the tuberculosis transcriptome data set GSE54992, a total of 4199 differentially expressed mRNAs and 264 differentially expressed lncRNAs were identified between healthy people and tuberculosis patients. WGCNA with hierarchical clustering and dynamic tree cutting identified four modules highly related to TB susceptibility. From the modules positively related to disease, a ceRNA network containing 36 lncRNAs, 36 miRNAs and 169 mRNAs was constructed. Ten hub lncRNAs with non-zero coefficients were screened out from the network: C20orf197, EPB41L4A-AS1, HAR1A, LINC00494, MIR210HG, MYLK-AS1, PCGEM1, RGS5, SNRK-AS1 and ST7-AS1. To select loci for experimental confirmation, 133 SNPs associated with the hub lncRNAs were first retrieved from the LincSNP 3.0 database; cross-comparison with the WGS data and the ChinaMAP database then identified two highly significant SNP loci of the lncRNA MIR210HG, rs1056812 and rs1599725, for subsequent experimental verification. The case-control validation included 240 people in the case group and 210 people in the control group. Genotyping showed that the genotypes at the rs1056812 locus were C/C, C/T and T/T, with no statistically significant difference in the distribution of the three genotypes between the case group and the control group. The genotypes at the rs1599725 locus were also C/C, C/T and T/T, and logistic regression analysis showed that the CT genotype differed significantly between the case group and the control group (OR=1.524, 95% CI: 1.020-2.278, P=0.040). In the dominant model, the rs1056812 CT+TT genotype did not differ from the CC genotype between the case and control groups, while the rs1599725 CT+TT genotype was associated with tuberculosis compared with the CC genotype (OR=1.572, 95% CI: 1.065-2.320, P=0.023). In the recessive model, neither the rs1056812 TT genotype nor the rs1599725 TT genotype differed from the CC+CT genotype between the case group and the control group. In the allele analysis, the distribution of C and T at the rs1056812 locus did not differ between the case group and the control group, whereas the T allele at the rs1599725 locus, compared with the C allele, was associated with the onset of tuberculosis (OR=1.420, 95% CI: 1.042-1.937, P=0.026).

Conclusions: By combining bioinformatic screening with experimental verification, this study found that the CT and TT genotypes at the rs1599725 locus of the long non-coding RNA MIR210HG may increase the risk of tuberculosis in the population. A possible mechanism is that MIR210HG is a core lncRNA in the ceRNA network built from the WGCNA positively correlated module of genes differentially expressed in tuberculosis: differential expression of MIR210HG may affect the expression of mRNAs in that module by regulating key microRNAs such as hsa-miR-22-3p, hsa-miR-27a-3p and hsa-miR-20b-5p. GO enrichment analysis showed that genes in the positively correlated module are mostly enriched in the inflammatory response, immune response and adaptive immune response pathways, which is broadly consistent with the pathogenic mechanism of tuberculosis. This study therefore indicates that the rs1599725 locus is associated with tuberculosis susceptibility.
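
As a consistency check on the statistics reported above, the following minimal sketch recovers the standard error, z statistic and two-sided P value implied by each reported OR and 95% CI for rs1599725. It assumes the intervals are Wald intervals (symmetric on the log-odds scale), which is common output from logistic regression software but is not stated in the abstract; the function name `wald_p_from_or_ci` is hypothetical. Only numbers quoted in the abstract are used.

```python
import math

def wald_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Return (se, z, p) implied by an odds ratio and its 95% Wald CI.

    Assumption (not stated in the abstract): the CI was computed as
    exp(log(OR) +/- z_crit * SE), so its width on the log scale is 2 * z_crit * SE.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# (OR, CI lower, CI upper, reported P) exactly as quoted in the abstract
reported = {
    "rs1599725 CT genotype": (1.524, 1.020, 2.278, 0.040),
    "rs1599725 CT+TT vs CC (dominant model)": (1.572, 1.065, 2.320, 0.023),
    "rs1599725 T vs C (allele)": (1.420, 1.042, 1.937, 0.026),
}

for label, (or_value, low, high, p_reported) in reported.items():
    se, z, p = wald_p_from_or_ci(or_value, low, high)
    print(f"{label}: SE={se:.3f}, z={z:.2f}, "
          f"implied P={p:.4f}, reported P={p_reported}")
```

Under the Wald assumption, the implied P values should agree closely with the reported ones, which suggests the ORs, intervals and P values in the abstract are internally consistent. Small discrepancies are expected because the inputs are rounded to three decimals.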
Keywords/Search Tags: tuberculosis, long non-coding RNA, single nucleotide polymorphisms, weighted gene co-expression network analysis, genome-wide association studies