
Non-linear Correlation Test Method And Its Application Research

Posted on: 2020-12-28
Degree: Master
Type: Thesis
Country: China
Candidate: S D Liu
GTID: 2430330575455712
Subject: Computer technology
Abstract/Summary:
In recent years, continuing advances in sequencing technology have steadily lowered sequencing costs, and the volume of biological sequencing data has grown rapidly. Analyzing and mining this massive sequence data effectively has become a major challenge in bioinformatics, and the value latent in the data has attracted the attention of many researchers in China and abroad. Although many studies have analyzed sequencing data, the algorithms they propose suffer from various problems: low efficiency, low scalability, poor parallelism, poor robustness, insensitivity to rare mutations, and difficulty capturing weak relationships. How to design a highly robust, highly parallel algorithm, and how to design a test that is sensitive to weak relationships and applicable to rare mutations, have therefore become active research topics. This thesis studies in depth how to capture weak relationships and designs a novel algorithm, the Partial Nearest-Neighbor Prediction Correlation Test (PNNPT). In addition, to identify pathogenic rare mutation sites effectively, it proposes the Bucket-Related Rare Variation Correlation Test (BRRVT), which tests the association between a single rare mutation site, or a group of them, and a disease. The specific research work is reflected in the following three aspects:

1. PNNPT is designed to address the low efficiency, low parallelism, and poor robustness of existing algorithms and their inability to capture some weak relationships. The algorithm shows advantages in identifying nonlinear relationships, particularly highly oscillating or ring-shaped relationships between two variables. Simulation studies and an analysis of real GWAS data on loci associated with high triglyceride levels show that the algorithm can be executed with a high degree of parallelism, is robust, and can effectively identify many novel relationships.

2. Many existing algorithms in bioinformatics are designed to test the association between common mutations and diseases and are not applicable to rare mutations. After studying a large body of methods and literature on identifying associations between rare mutations and disease, this thesis proposes a new method, the Bucket-Related Rare Variation Correlation Test (BRRVT), to find highly pathogenic rare mutation sites.

3. The combination of PNNPT and BRRVT is applied to a GWAS study of loci associated with high triglyceride levels in order to discover novel common and rare mutation sites.
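
The abstract does not describe how PNNPT works internally, so the following is not the thesis's algorithm. It is a minimal sketch of a generic nearest-neighbour prediction permutation test, included only to illustrate how a prediction-based statistic can respond to a highly oscillating relationship. All function names, the choice of mean squared error as the statistic, and the defaults k=4 and n_perm=199 are assumptions made for this illustration.

```python
import math
import random


def knn_indices(x, k):
    """For each point, list the indices of its k nearest neighbours in x (self excluded)."""
    n = len(x)
    neighbours = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: abs(x[j] - x[i]))
        neighbours.append(others[:k])
    return neighbours


def prediction_error(y, neighbours):
    """Mean squared error when each y[i] is predicted by the mean y of its neighbours."""
    total = 0.0
    for i, idx in enumerate(neighbours):
        predicted = sum(y[j] for j in idx) / len(idx)
        total += (y[i] - predicted) ** 2
    return total / len(y)


def nn_prediction_test(x, y, k=4, n_perm=199, seed=0):
    """Permutation test: is y predicted from x-neighbours better than by chance?

    Returns (observed_error, p_value). A small p-value suggests dependence.
    Hypothetical illustration, not the PNNPT procedure from the thesis.
    """
    rng = random.Random(seed)
    # Neighbours depend only on x, so they are computed once and reused
    # for every permutation of y.
    neighbours = knn_indices(x, k)
    observed = prediction_error(y, neighbours)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any x-y link while keeping y's distribution
        if prediction_error(shuffled, neighbours) <= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Highly oscillating signal: ten full sine periods on [0, 1) plus noise.
    noise = random.Random(1)
    x = [i / 200 for i in range(200)]
    y = [math.sin(20 * math.pi * xi) + noise.gauss(0, 0.1) for xi in x]
    error, p_value = nn_prediction_test(x, y)
    print(f"observed error = {error:.4f}, p-value = {p_value:.4f}")
```

Because each permutation is evaluated independently of the others, the loop over permutations could in principle be distributed across workers. This sketch would not detect a ring-shaped relationship, since the neighbour mean averages over both branches of the ring.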
Keywords/Search Tags:bioinformatics, correlation test, GWAS application, rare mutation, triglyceride