Genetic Risk Prediction For Complex Traits With Genome-wide Data

Posted on:2016-07-09

Degree:Master

Type:Thesis

Country:China

Candidate:W W Duan

GTID:2284330461496560

Subject:Epidemiology and Health Statistics

Abstract/Summary:

The genome-wide association study (GWAS) has been recognized as a robust tool for discovering the role of genetic factors in complex traits. By the end of October 2014, about 19,602 single nucleotide polymorphisms (SNPs), associated with 1,251 traits, had been detected. For any one trait, however, only a minority of SNPs pass the multi-stage validation of GWAS, and these explain only a small proportion of heritability. Many studies suggest that GWAS-validated SNPs perform poorly in genetic risk prediction for some traits, and neglecting the large number of small-effect SNPs has been regarded as a main reason. How to make use of the information in GWAS data has therefore become the key to success. Recently, two promising strategies have been proposed: one is to specify a loose significance level for hypothesis testing, and the other is to predict with all SNPs using a linear mixed model (LMM). Based on these two strategies, we propose two methods, sGRS and sGRS-LMM, and assess their performance in genetic risk prediction of complex traits.

This research has two purposes: first, to compare the prediction accuracy of sGRS and sGRS-LMM with that of other methods; second, to discuss underlying factors that affect prediction accuracy.

In this study, simulation trials are used to compare the prediction performance of BLUP, AM-BLUP, wGRS, RF, sGRS and sGRS-LMM, and these methods are then applied to real GWAS data on non-small cell lung cancer (NSCLC) in the Han Chinese population. The main contents of this study are as follows:

1. Simulations based on Chromosome 1: We use the Chromosome 1 genotypes from the real GWAS data and generate quantitative and binary phenotypes by simulation under different settings of four parameters: sample size, heritability, number of risk loci and population prevalence. The six methods are then applied to the simulated data.

2. Real data analysis: We apply the six methods to real GWAS data on NSCLC in the Han Chinese population. The Nanjing population is used as training data to build the prediction models of the six methods, and the Beijing population is used as test data to evaluate their prediction accuracy.

The main results of this study are as follows:

1. Results of the simulation trials: In most simulation conditions, the prediction accuracy of sGRS and sGRS-LMM is better than that of the other methods. Sample size, heritability, number of risk loci and population prevalence all affect the prediction accuracy of the six methods. The six methods show similar trends for quantitative and binary phenotypes.

2. Results of the real data analysis: The prediction accuracy of sGRS and sGRS-LMM is better than that of the other methods, and sGRS-LMM has the highest value of all methods (AUC=0.735). There is a large gap between the value for sGRS-LMM and the theoretical prediction accuracy.

Conclusion: The results of both the simulation trials and the real data analysis suggest that sGRS and sGRS-LMM are effective for genetic risk prediction with GWAS data.
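The abstract does not define sGRS or sGRS-LMM, so the following is only a minimal sketch of the first general strategy it mentions: selecting SNPs at a loose significance level and combining them into a weighted score, evaluated by AUC on a held-out set. It is not the thesis's method. The liability-threshold simulation, the use of marginal correlations as weights, the normal-approximation p-values, and every function name and parameter value here are assumptions made for illustration.

```python
import math
import numpy as np

def simulate(n, m, n_causal, h2, prevalence, rng):
    """Hypothetical simulation: standardized genotypes and a binary trait
    generated under a liability-threshold model with heritability h2."""
    maf = rng.uniform(0.05, 0.5, size=m)
    G = rng.binomial(2, maf, size=(n, m)).astype(float)
    G = (G - G.mean(axis=0)) / G.std(axis=0)
    beta = np.zeros(m)
    causal = rng.choice(m, size=n_causal, replace=False)
    beta[causal] = rng.normal(0.0, 1.0, size=n_causal)
    g = G @ beta
    g = g / g.std() * math.sqrt(h2)            # genetic variance = h2
    liability = g + rng.normal(0.0, math.sqrt(1.0 - h2), size=n)
    cutoff = np.quantile(liability, 1.0 - prevalence)
    return G, (liability > cutoff).astype(int)

def marginal_assoc(G, y):
    """Per-SNP correlation with the phenotype and a two-sided p-value
    from a normal approximation (an assumption, not a GWAS-grade test)."""
    n = G.shape[0]
    Gc = G - G.mean(axis=0)
    yc = y - y.mean()
    r = (Gc.T @ yc) / (np.sqrt((Gc ** 2).sum(axis=0)) * math.sqrt((yc ** 2).sum()))
    z = r * math.sqrt(n - 2) / np.sqrt(1.0 - r ** 2)
    p = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
    return r, p

def auc(score, y):
    """Rank-based (Mann-Whitney) AUC; ties are not handled specially."""
    ranks = np.empty(len(score))
    ranks[np.argsort(score)] = np.arange(1, len(score) + 1)
    n1 = int(y.sum())
    n0 = len(y) - n1
    return (ranks[y == 1].sum() - n1 * (n1 + 1) / 2.0) / (n1 * n0)

def threshold_grs_auc(G_tr, y_tr, G_te, y_te, p_cut):
    """Weighted score over SNPs with training p-value below p_cut.
    Returns NaN when no SNP passes the threshold."""
    r, p = marginal_assoc(G_tr, y_tr)
    keep = p < p_cut
    if not keep.any():
        return float("nan")
    return float(auc(G_te[:, keep] @ r[keep], y_te))

def run_demo(seed=0):
    """Compare a strict and two looser thresholds on one simulated data set."""
    rng = np.random.default_rng(seed)
    G, y = simulate(n=3000, m=500, n_causal=50, h2=0.5, prevalence=0.3, rng=rng)
    G_tr, y_tr, G_te, y_te = G[:2000], y[:2000], G[2000:], y[2000:]
    return {p_cut: threshold_grs_auc(G_tr, y_tr, G_te, y_te, p_cut)
            for p_cut in (5e-8, 1e-3, 0.05)}

if __name__ == "__main__":
    for p_cut, value in run_demo().items():
        print(f"p < {p_cut:g}: AUC = {value:.3f}")
```

Splitting the simulated sample into separate training and test parts mirrors the abstract's design, in which one population builds the model and another evaluates it. The parameters varied in the thesis's simulations (sample size, heritability, number of risk loci, prevalence) correspond to the arguments of the hypothetical `simulate` function.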

Keywords/Search Tags:

Genome-wide association study, Genetic risk prediction, Genetic risk score, Linear mixed model, Non-small cell lung cancer

Related items

1	Analysis On Risk Prediction Model For Complex Diseases And Data Mining On Genetic Variants
2	Genome-wide Trans-ancestry Meta-analysis Identifies Novel Susceptibility Loci For Non-small Cell Lung Cancer
3	Genome-Wide Association Studies Of Genetic Variants For Susceptibility To And Clinical Outcomes Of Esophageal Squamous-Cell Carcinoma And Small-Cell Lung Cancer
4	Association Between Common Genetic Variations And Early-onset Gastric Cancer Risk
5	Genetic Polymorphisms And Lung Cancer Risk:Evidence From Meta-analyses,and Genome-wide Association Studies
6	Study On Risk Prediction Model Of Lung Cancer In Chinese Population Based On Genome - Wide Association Study
7	Clustering by genetic ancestry using genome-wide single nucleotide polymorphisms and incorporating genetic ancestry into genetic risk prediction models
8	Genome-wide Scan Of The Effect Of Genetic Variants On Prognosis In Gastric Cancer Patients
9	N6-methyladenosine (m6A) Related Genetic Variants Are Associated With Lung Cancer Susceptibility
10	Genetic Score Calculation And Risk Prediction Model Construction Of Bladder Cancer