
Statistical Model In Bioinformatics And Its Simulations And Applications

Posted on: 2017-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Ma
Full Text: PDF
GTID: 2180330485971192
Subject: Electronic and communication engineering
Abstract/Summary:
This thesis studies statistical models and their applications in bioinformatics, and proposes two improvements to the integrated model. Statistical models have become an increasingly popular tool in bioinformatics research. The methods presented here use statistical models to identify significant loci associated with complex traits and diseases.

The first chapter is an introduction: it gives the background and significance of this research, reviews the current state of research worldwide, and gives a comprehensive overview of current genome-wide association studies (GWAS). The second chapter introduces the basic biological concepts needed to understand this thesis and gives a comprehensive introduction to the linear mixed model, paving the way for the third chapter. In the third chapter, we carry out simulation studies of the linear mixed model. We extend the basic linear mixed model to fit different problems and situations, and simulation studies of several different models demonstrate the feasibility of each model in its situation.

In the fourth chapter, a modified integrated model is proposed. The integrated model takes into account the correlation between the GWAS dataset and the annotation datasets when estimating parameters. The GWAS data are obtained from the linear mixed model mentioned above. We use the EM algorithm to estimate the parameters iteratively, and we validate the modified integrated model through a simulation study.

In the fifth chapter, an extensible integrated model is proposed to improve the ability of the modified integrated model to handle high-density annotation datasets. The extensible integrated model can integrate as many annotation datasets as possible to estimate parameters and extract information, so that more significant loci are reported on the basis of more accurate posterior probabilities.

Simulation studies show that both the modified integrated model and the extensible integrated model can control the global false discovery rate at a preset value. Other statistics, such as the area under the curve and statistical power, also indicate that the proposed models perform very well. The two proposed models are fitted to real data, and the results are analyzed.
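The abstract does not spell out how the global false discovery rate is held at a preset value. One common rule for posterior-probability-based models, shown here only as a minimal sketch and not as the thesis's actual procedure, is to treat one minus the posterior probability as a local false discovery rate, sort the loci by it, and report the largest set whose average local false discovery rate stays at or below the preset level. The function name `select_by_global_fdr`, the parameter `alpha`, and the example posterior values are all hypothetical.

```python
import numpy as np


def select_by_global_fdr(posterior, alpha=0.1):
    """Return a boolean mask of loci reported at estimated global FDR <= alpha.

    posterior: posterior probability that each locus is associated
               (assumed to come from some fitted model; hypothetical input).
    alpha:     preset global FDR level (hypothetical default).
    """
    posterior = np.asarray(posterior, dtype=float)
    # Local false discovery rate: probability that a locus is NOT associated.
    local_fdr = 1.0 - posterior
    # Rank loci from most to least confident.
    order = np.argsort(local_fdr, kind="stable")
    # Average local fdr of the top-k loci estimates the global FDR of that set.
    running_mean = np.cumsum(local_fdr[order]) / np.arange(1, len(order) + 1)
    # The running mean of ascending values never decreases, so counting the
    # entries at or below alpha gives the largest admissible k.
    n_selected = int(np.sum(running_mean <= alpha))
    mask = np.zeros(len(posterior), dtype=bool)
    mask[order[:n_selected]] = True
    return mask


if __name__ == "__main__":
    example_posterior = [0.99, 0.95, 0.6, 0.2, 0.9]  # made-up values
    print(select_by_global_fdr(example_posterior, alpha=0.1))
```

Under this rule, more accurate posterior probabilities translate directly into a larger reported set at the same preset level, which is the effect the abstract attributes to integrating more annotation datasets.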
Keywords/Search Tags: genome-wide association study, statistical model, integrated model, linear mixed model, EM algorithm