Data processing is an important part of scientific research, and many tools and methods for it have accumulated over the history of scientific development. With the advent of the information age, the amount of data that scientific research must process has increased dramatically, and traditional serial processing methods are often inefficient when faced with massive data. In recent years, General-Purpose Graphics Processing Units (GPGPUs) based on the Single Instruction, Multiple Data (SIMD) architecture have emerged as an upgrade to traditional graphics processing units. They can run data-parallel scientific computing tasks at high throughput and have broad prospects in scientific computing applications. However, many data processing tasks encounter problems when applied directly to GPGPUs: (1) the data processing procedure itself is highly interdependent, and its topology offers almost no parallelism; (2) existing algorithms are generally implemented for a general-purpose Central Processing Unit (CPU), with variable design and execution logic written in the style that is optimal for the CPU, which does not suit the hardware architecture of the GPU; (3) the few parallel algorithms implemented on the CPU derive their parallelism from the CPU's multiple cores, which is parallelism at a coarse-grained level, whereas the data processing power of the GPU is realized through parallelism at a fine-grained level; (4) mature commercial software optimizes its algorithms extremely meticulously and makes full use of the CPU's computing power, thereby achieving high efficiency on the CPU, but such software is closed source and cannot meet the scientific-research need to modify specific algorithms. To address these problems, this thesis examines two scientific computing scenarios. It presents a GPU parallel algorithm for linear data after approximation processing, and a parallel treatment of three-dimensional data in the 3D reconstruction algorithm of cryo-electron microscopy. These two cases embody two engineering approaches: changing the algorithm's topology to increase parallelism, and making full use of existing parallelism to accelerate and optimize. The innovations of this thesis include: proposing a segmented approximation parallel Dynamic Time Warping (DTW) algorithm for linear data processed by DTW, which decomposes the data into multiple segments and enables highly parallel computation at a coarse granularity level as well as diagonal parallel computation at a fine granularity level, thus allowing GPUs to achieve massive throughput while ensuring accuracy; leveraging the data independence in cryo-electron microscopy three-dimensional reconstruction so that each CUDA logical thread corresponds to one data point, achieving parallelism at multiple levels of granularity; and proposing a modified memory-saving algorithm for cases with limited GPU memory, which fully utilizes the parallel computing capability of GPUs.
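The diagonal parallelism mentioned above can be illustrated with a minimal sketch. It is not the segmented approximation algorithm proposed in this thesis; it only shows why the cells on one anti-diagonal of the DTW cost matrix are mutually independent. The function names, the absolute-difference cost, and the plain-Python loops are assumptions made for illustration; on a GPU, the inner loop over one anti-diagonal is the part that could be assigned to one thread per cell.

```python
def dtw_reference(a, b):
    """Classic row-by-row DTW with an absolute-difference cost (assumed here)."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


def dtw_wavefront(a, b):
    """Same recurrence, but traversed one anti-diagonal (i + j == k) at a time.

    Every cell on diagonal k depends only on diagonals k - 1 and k - 2,
    so the cells of one diagonal could be computed concurrently.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for k in range(2, n + m + 1):          # sequential sweep over diagonals
        i_lo = max(1, k - m)
        i_hi = min(n, k - 1)
        for i in range(i_lo, i_hi + 1):    # independent cells: parallelizable
            j = k - i
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

Both traversals fill the same matrix, so they return the same distance; only the order of evaluation differs. The sequential sweep over diagonals is the remaining dependency, which is one reason a coarse-grained decomposition into segments, as proposed in this thesis, is needed in addition to the fine-grained diagonal parallelism.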