Parallel Design And Implementation Of The Population Analysis Toolkit

Posted on: 2017-07-25 | Degree: Master | Type: Thesis
Country: China | Candidate: R Q Jing | Full Text: PDF
GTID: 2350330485991397 | Subject: Computer technology
Abstract/Summary:
As the volume of biological data grows rapidly, techniques for processing it are becoming increasingly important, and designing parallel algorithms that speed up the processing of specific kinds of biological data has become an active research topic. This thesis uses OpenMP to design and implement parallel algorithms for a set of population analysis tools, improving the efficiency of soybean data processing.
Using soybean genetic variation data as the main experimental data, the thesis studies the parallelization of three tools: an F-statistics tool (Estimating F-Statistics for the Analysis of Population Structure), a robust relationship inference tool (Robust relationship inference in genome-wide association studies), and a correlation analysis tool.
The F-statistics tool reads the source file and two population files, matches individual names between the two kinds of files, extracts the valid biological information, and calculates a value relating the two populations, which indicates the degree of differentiation between them. The robust relationship inference tool extracts the GT values from the VCF source file to obtain the raw data, applies the appropriate operations, and produces a relationship value for each pair of individuals. These values can be used to fill in gaps in the original pedigree, making the recorded relationships between organisms more complete. The correlation analysis tool reads and preprocesses the original VCF file, compares genotype data to count the heterozygous and homozygous SNP loci, and from these counts computes a correlation value for each pair of individuals. After further biological analysis of the correlation values, potential patterns of genetic inheritance can be identified.
For each tool, we first designed a serial algorithm that lends itself to parallelization, and then implemented the parallel algorithm by exploiting the independence of the operands.
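The idea of exploiting operand independence can be illustrated for the pairwise tools with a minimal C++/OpenMP sketch. It is not the thesis's implementation: the genotype encoding, the names `PairCounts`, `count_pair`, `all_pairs`, and `matching_fraction`, and the score itself are assumptions made for illustration, and the abstract does not give the actual correlation formula. What the sketch shows is that each pair of individuals can be processed without reference to any other pair, so the loop over pairs can be divided among threads without locking.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical genotype encoding (an assumption, not taken from the thesis):
// 0 = homozygous reference, 1 = heterozygous, 2 = homozygous alternate,
// -1 = missing call.
using Genotypes = std::vector<int>;

struct PairCounts {
    long both_het = 0;      // loci where both individuals are heterozygous
    long same_hom = 0;      // loci where both carry the same homozygous genotype
    long opposite_hom = 0;  // loci where the two carry opposite homozygous genotypes
    long compared = 0;      // loci with a called genotype in both individuals
};

// Compare two individuals locus by locus and tally the genotype categories.
PairCounts count_pair(const Genotypes& a, const Genotypes& b) {
    assert(a.size() == b.size());
    PairCounts c;
    for (std::size_t k = 0; k < a.size(); ++k) {
        if (a[k] < 0 || b[k] < 0) continue;  // skip loci with a missing call
        ++c.compared;
        if (a[k] == 1 && b[k] == 1) {
            ++c.both_het;
        } else if (a[k] == b[k]) {
            ++c.same_hom;                    // both 0 or both 2
        } else if (a[k] != 1 && b[k] != 1) {
            ++c.opposite_hom;                // 0 versus 2
        }
    }
    return c;
}

// Fill an n x n table of pair counts. Every cell is written by exactly one
// iteration of the outer loop, so the iterations are independent and can be
// shared among threads. schedule(dynamic) balances the shrinking inner loop.
std::vector<std::vector<PairCounts>> all_pairs(const std::vector<Genotypes>& samples) {
    const long n = static_cast<long>(samples.size());
    std::vector<std::vector<PairCounts>> table(n, std::vector<PairCounts>(n));
    #pragma omp parallel for schedule(dynamic)
    for (long i = 0; i < n; ++i) {
        for (long j = i + 1; j < n; ++j) {
            table[i][j] = count_pair(samples[i], samples[j]);
            table[j][i] = table[i][j];
        }
    }
    return table;
}

// Placeholder score (an assumption): the fraction of compared loci at which
// the two individuals have identical genotypes.
double matching_fraction(const PairCounts& c) {
    if (c.compared == 0) return 0.0;
    return static_cast<double>(c.both_het + c.same_hom) / c.compared;
}
```

With GCC or Clang the pragma takes effect when the file is compiled with `-fopenmp`. Without that flag the pragma is ignored and the loop runs serially with the same result, which makes it straightforward to compare serial and parallel output.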
Finally, the thesis compares the results and execution times of the serial and parallel versions of each tool. The experimental results show that OpenMP parallelization effectively improves the data processing and analysis efficiency of every population analysis tool, which is significant for processing massive bioinformatics data.
Keywords/Search Tags: Population analysis, F-statistics, Robust, Correlation, OpenMP