Research On The Strategy And Method Of Constructing The Regression Model Of Omics Interactive Network

Posted on: 2020-06-06
Degree: Master
Type: Thesis
Country: China
Candidate: W Q Lin
Full Text: PDF
GTID: 2404330572984234
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Complex diseases result from the combined action of biomarkers across multiple omics levels, not from the simple superposition or accumulation of their individual effects. Omics markers at different levels typically follow the flow of biological information along the continuum from genome to transcriptome, proteome, metabolome and phenotype, and interlock into an omics network system that governs the occurrence, development and outcome of disease. The multi-level omics markers of complex diseases are therefore characterized by both "high dimension" and "network" structure. Traditional strategies and methods for screening omics markers mostly ignore these two characteristics, which inevitably leads to loss of information and may even produce wrong results. Moreover, simple linear correlation is unsuitable for describing relationships between biological molecules, which show a variety of non-linear patterns. Within the framework of network medicine, we introduced pointwise mutual information to characterize the "edge effect" in the omics interactive network, estimated its value by two-dimensional kernel density estimation, and then built an omics interactive network regression model that integrates the "node effect" and the pointwise mutual information-based "edge effect". Statistical simulations were used to evaluate the type I error and statistical power of the model under different sample sizes, different "edge effect" patterns and different network structures, and real transcriptome data from lung cancer further verified the practicability of the model.

Method: Because the regulatory relationships between biological molecules are complex, the simple linear correlation commonly used in statistics can hardly capture them, whereas pointwise mutual information can measure both linear and non-linear association between two variables. In this study, pointwise mutual information was first introduced to represent the correlation between nodes in the omics biological network, that is, the "edge effect" of the network, and kernel density estimation was used to estimate the pointwise mutual information between nodes of the omics network. Then, under the logistic regression framework, a pointwise mutual information-based omics network regression model (PMINR) was built to identify, across the whole omics interaction network, whether omics biomarkers or the correlations between biological molecules (such as regulatory relationships between gene expression levels in the network) are associated with the occurrence of complex diseases. Two simulation schemes were designed to evaluate the effectiveness and validity of the model: 1. the network structure was fixed, that is, the differential nodes and edges in the omics biological network were the same in every simulation; 2. the differential nodes and differential edges were randomly assigned. Two simulation scenarios were considered under each scheme: (1) the correlation between nodes is linear; (2) the correlation between nodes is non-linear. Four patterns of network difference were further set under each scenario: (1) only nodes differ; (2) only edges differ; (3) both nodes and edges differ, and the differential edges are connected to the differential nodes; (4) both nodes and edges differ, but the differential edges and differential nodes are not connected. This series of simulations allows PMINR to be compared, in terms of type I error rate and statistical power, with the product moment-based omics network regression model (PMNR) commonly used in bioinformatics.

Result: The simulations show that: 1. when the correlation between two nodes is linear, both models control the type I error well in all settings, holding it steady at around 0.05; PMINR and PMNR have similar power to identify differential nodes, but PMINR is somewhat weaker than PMNR at identifying differential edges; 2. when the correlation between two nodes is non-linear, PMNR has low power to identify differential nodes and can hardly identify differential edges, whereas PMINR identifies differential edges well while keeping the type I error well controlled. Gene expression data from 187 smokers in the GEO database were used to further validate the models. PMINR identified three differential nodes (AKT2, BAD and JAK3), and PMNR identified two (BAD and JAK3). PMINR also identified four differential edges (RAF1-MAP2K1, ERBB2-TGFA, CASP9-AKT2 and PIK3CD-EML4), whereas PMNR failed to identify any. Previous studies provide sound biological explanations for these differential nodes and edges, which further supports the practicability of the proposed model.

Conclusion: Pointwise mutual information can better capture different patterns of "edge effect" in the omics interaction network. Under all simulation settings, the pointwise mutual information-based omics network regression model controlled the type I error well. Whether the relationship between nodes is linear or non-linear, PMINR identified differential nodes and edges well, and its performance was robust across different network structures.
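
For readers unfamiliar with the quantity, the pointwise mutual information of a pair of observations (x, y) is log[ f(x, y) / (f(x) f(y)) ], where f(x, y) is the joint density and f(x), f(y) are the marginal densities. The abstract does not give the details of the kernel density estimator, so the following is only a minimal sketch: the Gaussian product kernel, Scott's rule-of-thumb bandwidths, the simulated data and all function names are assumptions made for illustration, not the thesis's implementation.

```python
import math
import random


def scott_bandwidth(values, dim):
    """Scott's rule-of-thumb bandwidth for one coordinate of a dim-dimensional KDE (assumed choice)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return sd * n ** (-1.0 / (dim + 4))


def gauss(u):
    """Standard normal density, used as the kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def pointwise_mi(x, y):
    """Return one PMI estimate per sample: log f(x_i, y_i) / (f(x_i) * f(y_i))."""
    n = len(x)
    hx1, hy1 = scott_bandwidth(x, 1), scott_bandwidth(y, 1)  # marginal KDEs
    hx2, hy2 = scott_bandwidth(x, 2), scott_bandwidth(y, 2)  # joint two-dimensional KDE
    pmi = []
    for xi, yi in zip(x, y):
        fx = sum(gauss((xi - xj) / hx1) for xj in x) / (n * hx1)
        fy = sum(gauss((yi - yj) / hy1) for yj in y) / (n * hy1)
        fxy = sum(
            gauss((xi - xj) / hx2) * gauss((yi - yj) / hy2)
            for xj, yj in zip(x, y)
        ) / (n * hx2 * hy2)
        pmi.append(math.log(fxy / (fx * fy)))
    return pmi


def pearson(x, y):
    """Product-moment correlation, shown for comparison with PMI."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n, seed):
    """Hypothetical data: x, an independent partner, and a quadratic (non-linear) partner."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y_indep = [rng.gauss(0, 1) for _ in range(n)]
    y_nonlin = [xi * xi + 0.3 * rng.gauss(0, 1) for xi in x]
    return x, y_indep, y_nonlin


x, y_indep, y_nonlin = simulate(200, seed=1)
for label, y in (("independent", y_indep), ("quadratic", y_nonlin)):
    pmi = pointwise_mi(x, y)
    print(label, "mean PMI:", sum(pmi) / len(pmi), "Pearson r:", pearson(x, y))
```

In this toy setting the quadratic pair is expected to show a clearly larger average PMI than the independent pair even though its product-moment correlation has expectation zero, which is the motivation the abstract gives for replacing the product-moment edge term with a PMI-based one. How the per-sample PMI values enter the logistic regression as edge terms is not specified in the abstract and is not sketched here.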
Keywords/Search Tags: Network medicine, Omics interactive network, Pointwise mutual information, Regression model, Complex diseases