Research On Global Sequence Alignment For Intel Multi-core And Many-core Platforms

Posted on:2019-09-02

Degree:Master

Type:Thesis

Country:China

Candidate:J K Zhang

Full Text:PDF

GTID:2370330545453692

Subject:Computer Science and Technology

Abstract/Summary:

With the development of DNA sequencing technology, the number of gene sequences has increased rapidly. To use these sequence data effectively, we often need to align them against known genomes to obtain similarity and homology information, which lays the foundation for further analysis. Traditional sequence alignment algorithms are inefficient on massive sequence sets because of their computational complexity. In recent years, hardware and software technologies have developed considerably, especially with the emergence of many-core architectures, and high-performance computing has played an increasingly important role in areas such as computational biology, artificial intelligence, and natural language processing. Applying high-performance computing to sequence alignment can significantly speed up alignment and improve the efficiency of sequence analysis.

In this work, we focus on the global sequence alignment problem and use the computing capabilities of Intel's multi-core and many-core platforms to accelerate the alignment procedure, with good results. Needleman-Wunsch is the commonly used global sequence alignment algorithm. Two alignment algorithms using bit-parallel optimization are derived from it, Myers and BitPAl; both give up some functionality in exchange for higher performance. We optimize these algorithms along two dimensions: thread parallelism and SIMD parallelism. For thread parallelism, we first divide the sequence data into multiple blocks, and each thread then processes one block in parallel. Within each thread, we use SIMD instructions, such as SSE, AVX2, KNC, and AVX512, for finer-grained vectorized parallelism.

To improve the scalability of the system, we have designed and implemented a modular parallel framework. The functions of the system are refined and divided into several independent functional modules. We have abstracted out a computation module that handles the alignment logic; other modules only need to pass data to this module and collect the result, without caring about its implementation details. To add a new alignment algorithm, we only need to reimplement the computation module and can reuse the functionality of the other modules. We have also designed virtual SIMD instructions and implemented a corresponding instruction interpreter to address the inconsistency among SIMD instruction sets. With the virtual SIMD instructions, we write a single uniform code base and then translate it to the different SIMD instruction sets. We have tested our parallel algorithms on different platforms, and the experiments show that they achieve excellent speedup. Moreover, we compare our implementation with other parallel implementations, and ours achieves higher performance.
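The relationship between Needleman-Wunsch and a Myers-style bit-parallel algorithm, as described in the abstract, can be illustrated with a small sketch. It is not the thesis implementation. It assumes unit costs (match 0, mismatch 1, gap 1), which is the kind of reduced functionality the bit-parallel variants accept. It uses the global edit-distance formulation of the bit-vector recurrence, with a Python integer standing in for a machine word or SIMD register. The function names are hypothetical.

```python
def needleman_wunsch_distance(a: str, b: str) -> int:
    """Reference dynamic programming: global alignment cost with unit costs."""
    prev = list(range(len(b) + 1))          # row 0: cost of aligning "" with b[:j]
    for i, ca in enumerate(a, 1):
        cur = [i]                           # column 0: cost of aligning a[:i] with ""
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j - 1] + (ca != cb),   # match / mismatch
                           prev[j] + 1,                # gap in b
                           cur[j - 1] + 1))            # gap in a
        prev = cur
    return prev[-1]


def myers_edit_distance(a: str, b: str) -> int:
    """Bit-parallel global edit distance: one column of the DP table per step.

    Vertical differences of a column are encoded in two bit-vectors:
    pv (difference +1) and mv (difference -1), one bit per character of a.
    """
    m = len(a)
    if m == 0:
        return len(b)
    mask = (1 << m) - 1                     # keep every vector to m bits
    high = 1 << (m - 1)                     # bit for the last row of the table
    peq = {}                                # peq[c]: positions i where a[i] == c
    for i, ch in enumerate(a):
        peq[ch] = peq.get(ch, 0) | (1 << i)

    pv, mv, score = mask, 0, m              # column 0: D[i][0] = i
    for ch in b:
        eq = peq.get(ch, 0)
        xv = eq | mv
        xh = (((eq & pv) + pv) ^ pv) | eq
        ph = mv | (~(xh | pv) & mask)       # horizontal difference +1
        mh = pv & xh                        # horizontal difference -1
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1
        ph = ((ph << 1) | 1) & mask         # row 0 grows by 1 per column (global)
        mh = (mh << 1) & mask
        pv = mh | (~(xv | ph) & mask)
        mv = ph & xv
    return score


if __name__ == "__main__":
    for x, y in [("kitten", "sitting"), ("GATTACA", "GCATGCU")]:
        print(x, y, needleman_wunsch_distance(x, y), myers_edit_distance(x, y))
```

Both functions return the same cost, but the second one updates a whole column with a constant number of word operations. In a native implementation the Python integers would be replaced by fixed-width words or SIMD registers, and patterns longer than one word would have to be split across several; that part is not shown here.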

Keywords/Search Tags:

High performance computing, sequence alignment, SIMD, heterogeneous computing, parallel framework

Related items

1	The Parallel Computing Research On High-performance Spatial Analysis Under CPU/GPU Heterogeneous Environment
2	Research On OpenMP 4.0 Based Heterogeneous Parallel Computing Techniques For CFD Applications
3	Study Of Fast Gene Sequence Alignment Method Based On Parallel Computing
4	Parallel DNA Read Mapping Algorithms On Multi-core And Many-core Architecture
5	High-performance Biological Sequence Processing Framework For NGS Data
6	Research And Implementation Of High-Throughput Sequencing Alignment Method Based On Distributed Computing
7	High Performance Method Of Moments And Its Application In Electromagnetic Simulation Of Complex Targets
8	Large Scale Ultra-long Biological Sequence Clustering
9	Researches On Parallel Algorithms For DNA Sequence Alignment
10	Research On The Key Technology Of HopeFOAM:A Discontinuous Finite Element Based High Order Parallel Computing Framework