Application of Bayesian Networks in the Analysis of Clinical Data and Microarray Gene Expression Data

Posted on: 2008-01-16
Degree: Master
Type: Thesis
Country: China
Candidate: H T Yu
GTID: 2144360218959013
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Recent research indicates that cancer is a multifactorial disease. It is related to the patient's environment and characteristics, and it also develops through gradual accumulation and involves multiple genes. Because the disease first changes at the gene level, cancer can be analyzed by combining clinical data with microarray gene expression data, which provides an interpretable basis for cancer prediction, treatment effect, and prevention. This thesis applies Bayesian networks to the analysis of clinical and microarray data, describes the influence of each factor from a statistical angle, and reveals the multiple, multi-layer causal relationships among the variables.

The thesis introduces the basic concepts and classification of Bayesian networks, then presents structure learning and parameter learning in detail, together with an optimized algorithm that combines Bayesian network learning with a decision tree algorithm. The networks are constructed with the FullBNT toolbox in Matlab 7.0.

In the case analysis, the thesis analyzes clinical data from 1441 cancer patients and builds a Bayesian network model with 48 variables and 71 directed arcs. The original data comprise 5 parts, each containing several variables, and the model reveals the relationships among these variables. For network learning, the author discusses how to set the maximum number of parent nodes and how well the Bayesian network estimates the probability of low-probability events. By randomly setting one fifth of the values as missing, the thesis examines how well the EM algorithm, which estimates the Bayesian network parameters, handles data with missing values.

Using gastric cancer clinical data, the thesis evaluates how applying the chi-square test affects Bayesian network learning. It analyzes clinical data from 122 gastric cancer patients and builds a Bayesian network model with 4 variables and 5 directed arcs, which shows the relationships among gastric state, lymph node metastasis, peritoneal dissemination, and depth of invasion. The thesis also discusses how to use the model for diagnosis.

At the micro level, the thesis describes in detail the whole process of constructing a Bayesian network model from microarray data. Microarray hybridization experiments inevitably produce missing gene expression values, and the gastric gene expression data also contain many, so the thesis uses the KNN algorithm to fill them in. Gene expression data are generally continuous. Continuous data can be used in a Bayesian network, but their practical meaning is unclear, and discretized data can increase the precision of network learning. The thesis therefore discretizes the gene expression data with the μ±σ method and constructs a Bayesian network with 37 genes and 35 directed arcs. It then analyzes the role and influence of two nodes in the network that have many child nodes.

From these experiments and comparisons, the thesis draws the following conclusions:
1. It is reasonable to analyze clinical data and microarray data with the Bayesian network method. The method describes the relationships among variables from a statistical angle, reveals multiple, multi-layer causal relationships, and can guide clinical diagnosis and treatment.
2. Bayesian networks handle missing values well. More accurate network parameters can be obtained by learning from data with missing values.
3. Combining the Bayesian network algorithm with a decision tree algorithm solves the node-ordering problem and improves learning efficiency by optimizing the structure learning algorithm.
4. A workflow for constructing Bayesian network models from microarray data is established. The KNN algorithm fills in missing values, and the μ±σ three-level discretization method discretizes the expression data, which improves both the speed of Bayesian network learning and the interpretability of the results.
5. Bayesian networks are weak at estimating low-probability events, so variables should not be classified too finely.

Overall, as an effective data mining method, the Bayesian network has a sound theoretical basis and a clear form of knowledge representation. Introducing it into the analysis of clinical data and microarray experiment data makes it possible to build models and analyze the influence of clinical variables and genes. It can be applied to gastric cancer research, to observing gene expression changes caused by cancer treatment, and to exploring the decisive factors in treating cancer cells.
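
The microarray preprocessing mentioned in the abstract (KNN filling of missing values followed by μ±σ three-level discretization) can be illustrated with a minimal Python sketch. This is not the thesis's Matlab implementation, and the abstract does not give the exact conventions, so the following are assumptions: the function names `knn_impute` and `discretize_mu_sigma`, the root-mean-square distance over samples observed in both genes, the use of the population standard deviation, and the level codes -1 / 0 / 1 for low / normal / high expression.

```python
import math
import statistics


def knn_impute(rows, k=2):
    """Fill None entries in each row (one row per gene) with the mean of that
    column taken from the k nearest rows that have the column observed.
    Assumption: distance is the RMS difference over co-observed columns."""

    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # no overlap, so the rows cannot be compared
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours: other rows in which column j is observed.
            candidates = sorted(
                (distance(row, other), idx)
                for idx, other in enumerate(rows)
                if idx != i and other[j] is not None
            )
            nearest = [idx for d, idx in candidates[:k] if d < math.inf]
            if nearest:  # the entry stays None if no usable neighbour exists
                filled[i][j] = sum(rows[idx][j] for idx in nearest) / len(nearest)
    return filled


def discretize_mu_sigma(values):
    """Map one gene's expression values to three levels:
    -1 if below mu - sigma, 1 if above mu + sigma, otherwise 0.
    Assumption: sigma is the population standard deviation of this gene."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    levels = []
    for v in values:
        if v < mu - sigma:
            levels.append(-1)
        elif v > mu + sigma:
            levels.append(1)
        else:
            levels.append(0)
    return levels


# Hypothetical toy data: 3 genes (rows) measured on 4 samples (columns).
expression = [
    [1.0, 2.0, None, 4.0],
    [1.1, 2.1, 3.1, 4.1],
    [5.0, 5.0, 5.0, 5.0],
]
imputed = knn_impute(expression, k=1)
discrete = [discretize_mu_sigma(gene) for gene in imputed]
```

The discretized rows would then serve as the categorical input to structure and parameter learning. Both k and the choice of standard deviation are tuning decisions that the abstract does not specify.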
Keywords/Search Tags: Liver cancer, Gastric cancer, Gene chip, Bayesian networks, Structure learning, Parameter learning