| Alzheimer’s disease (AD) is an irreversible neurodegenerative disease. As a worldwide medical problem, AD has become a serious public health issue that brings suffering to patients and greatly increases the burden on society, and as the population ages, the proportion of elderly people will continue to grow. Twin studies have demonstrated the high heritability of AD and motivated genomic studies of the disease, but the risk genes identified so far are still far from fully explaining that heritability, leaving a substantial share of it unexplained. Unlocking the genetic basis of AD to enable early diagnosis and intervention is therefore of great importance to human health. In this paper, using the ADNI dataset, we explore the relationship between interactions among loci with small main effects and the missing heritability of AD. With the goal of mining loci with weak main effects, significant interactions, and high phenotypic explanation rates, we design interaction calculation methods to further complement the explanation of the genetic mechanisms of AD and contribute to early intervention. First, the ADNI dataset was processed. A genotype quality control procedure was developed based on minor allele frequency, SNP call rate, Hardy-Weinberg equilibrium, and other indicators. Combining AD pathological hypotheses and phenotypic characteristics, two quantitative phenotypes, T-tau/Aβ42 and P-tau/Aβ42, were constructed, and quality control was completed based on the principles of baseline consistency and normal distribution to obtain a uniform sample space across phenotypes and provide data for subsequent studies. Second, a genome-wide SNP-SNP interaction study was conducted based on the hypothesis of "significant interaction between small main effect loci", and a multiple linear regression model 
was designed to test the statistical significance of the interaction term. To address the combinatorial explosion and heavy computation of genome-wide interaction detection, we designed a fine-grained GPU-based parallel detection method and completed about 160 billion genome-wide SNP-SNP interaction tests in a short time. We also conducted secondary validation and comparative analysis of the experimental results and demonstrated that the interactions improve the phenotypic explanatory power. Finally, a gene-gene interaction test was designed to alleviate the multiple testing correction problem of genome-wide SNP-SNP interactions, improve statistical power, and support analysis of biological significance. To handle the correlations among the multiple SNP-SNP pairs within a gene-gene interaction, a gene-gene interaction correlation coefficient matrix was constructed, and an intra-class screening step based on interaction significance and a weighted integration method based on the correlation coefficient matrix and the number of features were proposed to calculate the gene-gene interactions. Multiple enrichment analyses based on pathway and disease information were performed for genes with significant interactions; they pointed to several biological processes related to AD pathology, such as those involving synapses and neurons, supporting the relevance of the experimental results to AD. In summary, this paper designed a computational approach to detect SNP-SNP and gene-gene interactions at the genome-wide level, starting from the interactions between loci with small main effects, and conducted experiments on multiple AD phenotypes. The experimental results both replicated some known AD risk genes and identified new candidate genes, providing new ideas and references for accounting for the missing heritability of complex diseases through interaction studies. |
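
The interaction test described in the abstract can be illustrated with a minimal sketch. It assumes a model of the form y = b0 + b1·g1 + b2·g2 + b3·g1·g2 + e with additive 0/1/2 genotype coding and no covariates; the abstract does not specify the exact model, so these are assumptions. All function names (`solve_linear_system`, `interaction_test`, `simulate_pair`) are hypothetical, the data are simulated rather than taken from ADNI, and the p-value uses a normal approximation to the t distribution, which is only reasonable for large samples. This is a single-pair CPU illustration, not the GPU method from the paper.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def interaction_test(g1, g2, y):
    """Fit y ~ 1 + g1 + g2 + g1*g2 by least squares and test the last term.

    Returns (interaction coefficient, t statistic, approximate two-sided p).
    """
    rows = [[1.0, a, b, a * b] for a, b in zip(g1, g2)]
    n, k = len(y), 4
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    residuals = [yi - sum(bj * xj for bj, xj in zip(beta, r))
                 for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in residuals) / (n - k)
    # Last column of (X'X)^-1; its last entry scales the variance of beta[3].
    inv_col = solve_linear_system(xtx, [0.0, 0.0, 0.0, 1.0])
    t_stat = beta[3] / math.sqrt(sigma2 * inv_col[3])
    # Assumption: normal approximation to the t distribution (large n).
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return beta[3], t_stat, p_value


def simulate_pair(n, beta_int, seed=0):
    """Simulated SNP pair: weak main effects, interaction effect beta_int."""
    rng = random.Random(seed)
    g1 = [rng.choice((0, 1, 2)) for _ in range(n)]
    g2 = [rng.choice((0, 1, 2)) for _ in range(n)]
    y = [0.05 * a + 0.05 * b + beta_int * a * b + rng.gauss(0.0, 1.0)
         for a, b in zip(g1, g2)]
    return g1, g2, y


if __name__ == "__main__":
    g1, g2, y = simulate_pair(n=500, beta_int=0.8)
    coef, t_stat, p_value = interaction_test(g1, g2, y)
    print(f"interaction coef={coef:.3f}, t={t_stat:.2f}, p~{p_value:.2e}")
```

In a genome-wide scan this test would be repeated for every SNP pair, which is why the number of tests grows quadratically with the number of SNPs and why both parallel hardware and multiple testing correction become necessary.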