
Design And Implementation Of Operation Acceleration Scheme For Gene Data Analysis And Processing Software BQSR

Posted on: 2020-02-10; Degree: Master; Type: Thesis
Country: China; Candidate: X Sha; Full Text: PDF
GTID: 2370330590483057; Subject: Electronics and Communications Engineering
Abstract/Summary:
Precision medicine is an emerging approach to disease diagnosis that identifies the cause of a disease at the genetic level. Its rise is inseparable from the development of technology for analyzing and processing genetic data. The BQSR software is an important step in that processing: it recalibrates base quality scores and has a crucial influence on the accuracy of mutation site detection. Because genetic data sets are so large, the BQSR software commonly used in industry needs tens or even hundreds of hours to recalibrate the base quality scores of whole-genome data, which greatly delays disease diagnosis. This paper therefore implements a set of acceleration schemes that target the time-consuming bottlenecks of BQSR. The schemes address two areas, IO and computation. On the IO side, the IO thread and the computing threads run in parallel; the number of sequences processed in each batch is reduced to relieve memory pressure; and the time spent compressing and encoding the output data is shortened. On the computation side, synchronization locks are eliminated to improve multithreaded concurrency; the BAQ algorithm module is refactored to run faster; the program's data caching and indexing mechanisms are sped up; and base context encoding is accelerated by exploiting the overlap between the contexts of adjacent bases. After implementing the schemes, this paper tests each acceleration module on three different types of data sets, measuring the performance gain from each optimization and checking that the output before and after optimization is consistent, and then tests the overall performance gain of the accelerated BQSR. The results show that, with output 100% consistent with the original program, the proposed acceleration scheme achieves speedups of 3.91 times, 4.04 times and 4.72 times on the TS, WES and WGS data sets, respectively.
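
The overlap idea behind the base context optimization can be illustrated with a small sketch. The abstract does not give the encoding used in the thesis, so everything here is an assumption made for illustration: each base is packed into 2 bits, the context of a position is the k bases ending at that position, and -1 marks positions with no valid context (too close to the start of the read, or within k bases after an ambiguous base such as N). The function names are hypothetical. Because consecutive contexts share k-1 bases, the rolling version derives each key from the previous one with one shift and one mask instead of re-encoding all k bases.

```python
# Illustrative sketch only; the 2-bit packing, the -1 sentinel and the
# function names are assumptions, not taken from the thesis.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode_contexts_naive(read, k):
    """Encode the k-base context ending at each position from scratch: O(n*k)."""
    keys = []
    for i in range(len(read)):
        window = read[max(0, i - k + 1): i + 1]
        if len(window) < k or any(b not in BASE_CODE for b in window):
            keys.append(-1)  # no valid context at this position
            continue
        key = 0
        for b in window:
            key = (key << 2) | BASE_CODE[b]
        keys.append(key)
    return keys


def encode_contexts_rolling(read, k):
    """Reuse the previous key, since adjacent contexts share k-1 bases: O(n)."""
    mask = (1 << (2 * k)) - 1  # keeps only the lowest 2*k bits
    keys = []
    key = 0    # packed code of the most recent valid bases
    valid = 0  # number of consecutive valid bases seen so far
    for b in read:
        code = BASE_CODE.get(b)
        if code is None:
            # An ambiguous base invalidates every context that contains it.
            key, valid = 0, 0
            keys.append(-1)
            continue
        # Shift in the new base; the mask drops the base that left the window.
        key = ((key << 2) | code) & mask
        valid += 1
        keys.append(key if valid >= k else -1)
    return keys
```

Both functions return the same keys for any read, so the naive version can serve as a reference when checking that the faster one leaves the output unchanged, in the spirit of the consistency check described in the abstract.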
Keywords/Search Tags: Precision medicine, Genetic data analysis and processing, Base quality score recalibration, Software acceleration