
Pipelining And Scheduling In Queries Based On Inference

Posted on: 2024-05-29
Degree: Master
Type: Thesis
Country: China
Candidate: L R Meng
GTID: 2568307067993469
Subject: Software Engineering

Abstract/Summary:
In recent years, more and more cameras have appeared in people's lives. The pictures, videos, and other data they provide play an important role in accident prevention, criminal investigation, and crime solving, which in turn raises the requirements for query processing. The emergence of big data has given rise to large-scale data processing systems that can handle large, diverse, low-value-density, real-world data, and this area remains active in both academia and industry. Meanwhile, deep learning techniques, thanks to their excellent ability to automatically extract features, have played an important role in applications over picture, audio, and video data. Many systems therefore process data queries with deep learning, and fusing a large-scale data processing system with a deep learning system is a common approach. Large-scale data processing systems mostly rely on multi-core CPUs (central processing units) to improve performance, while neural network models, with their many layers, need a large amount of computation for a single inference and thus require GPUs (graphics processing units) with high-speed floating-point capability. Processing such queries therefore requires both types of processors, CPU and GPU. In existing systems, the CPU and the GPU process data sequentially, which is a blocking execution strategy; this serial use of resources leads to low resource utilization and poor query efficiency.

To address these problems of low resource utilization and poor query efficiency when heterogeneous systems process inference-based queries, this thesis analyzes the characteristics of the query workload, shows that the problem can be solved with pipelining techniques from parallel computing, and proposes a pipelining execution strategy. However, introducing a pipelining execution strategy also brings the typical bubble problem. To reduce the impact of bubbles, this thesis investigates how to eliminate them through a scheduling mechanism and thereby further accelerate queries. Finally, this thesis develops a prototype system that integrates the pipelining execution strategy and the pipelining scheduling technique. The main contributions of this thesis are as follows.

We propose a pipelining execution strategy that enables CPU/GPU parallelism in heterogeneous systems, improving on the serial execution of the blocking strategy in both resource utilization and query efficiency. The shortcoming of the blocking execution strategy used by existing queries is that the CPU and GPU process data serially; solving this problem means the query must run the CPU and GPU in parallel. This thesis therefore draws on pipelining techniques to implement a pipelining execution strategy that supports CPU/GPU parallelism. The strategy exploits a property of inference-based queries: data is processed in batches, and each batch requires different operations from the CPU and the GPU. Experiments in this thesis show that the pipelining execution strategy supporting CPU/GPU parallelism improves query efficiency by up to 150% compared to the default blocking serial execution strategy.
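The following is a minimal sketch of this pipelining idea, not the thesis's actual Spark/TensorRT implementation: a bounded queue decouples a CPU preprocessing stage from a GPU inference stage, so the CPU prepares batch i+1 while the GPU works on batch i. The helpers cpu_preprocess and gpu_infer are hypothetical placeholders for the two kinds of work.

    # Sketch only: overlap CPU preprocessing with GPU inference via a bounded queue.
    # cpu_preprocess() and gpu_infer() are hypothetical stand-ins, not thesis APIs.
    import threading
    import queue

    def run_pipelined(batches, cpu_preprocess, gpu_infer, depth=2):
        ready = queue.Queue(maxsize=depth)  # buffer between the CPU and GPU stages
        results = []

        def cpu_stage():
            for batch in batches:
                ready.put(cpu_preprocess(batch))  # CPU prepares the next batch
            ready.put(None)                       # sentinel: no more batches

        producer = threading.Thread(target=cpu_stage)
        producer.start()
        while True:
            item = ready.get()
            if item is None:
                break
            results.append(gpu_infer(item))       # runs while the producer keeps working
        producer.join()
        return results

The bounded queue depth caps how far the CPU stage may run ahead of the GPU stage, which is what makes this a pipeline rather than simply materializing all preprocessed batches up front.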
We design a pipelining scheduling mechanism that supports data prefetching, implementing scheduling for a single query with multiple predicate types. The pipelining execution strategy proposed in the previous contribution achieves a large performance improvement, but the "bubbles" that arise in multi-predicate inference-based queries leave room for further gains. To solve this problem, this thesis implements a pipelining scheduling mechanism that supports data prefetching. The mechanism exploits another property of inference-based queries: the CPU and GPU tasks required by each batch follow a specific sequence, but there is no direct dependency between batches, so the next batch can be fetched early (see the sketch after the summary below). Experiments in this thesis show that the pipelining scheduling mechanism supporting data prefetching improves query efficiency by up to 34.9% compared to the implementation that uses only the pipelining execution strategy.

We implement a prototype system that adopts the above pipelining execution strategy and pipelining scheduling mechanism. Building on the research above, this thesis develops a prototype based on Spark, a large-scale data processing system with a rich software stack, and TensorRT, a deep learning system with strong inference acceleration. The system is also used to verify the effectiveness of the pipelining execution strategy supporting CPU/GPU parallelism and of the pipelining scheduling mechanism supporting data prefetching.

In summary, this thesis focuses on the low resource utilization and poor query efficiency caused by the blocking execution strategy used by existing heterogeneous systems to process inference-based queries. To address this problem, the thesis proposes a pipelining execution strategy, further optimizes the scheduling of this strategy, and finally implements a prototype system. Theoretical analysis and experimental results show that this work breaks the status quo of blocking execution and improves execution efficiency under limited computing resources. It also provides implications for applying deep-learning-based inference techniques to data querying.
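To make the prefetching idea referenced in the second contribution concrete, here is a minimal sketch, again independent of the thesis's Spark/TensorRT code: because batches are mutually independent, the next batch can be fetched and preprocessed while the GPU is still busy with the current one, filling the bubbles between batches. fetch_batch and gpu_infer are hypothetical placeholders.

    # Sketch only: prefetch the next batch while the GPU processes the current one.
    # fetch_batch() and gpu_infer() are hypothetical stand-ins, not thesis APIs.
    from concurrent.futures import ThreadPoolExecutor

    def run_with_prefetch(batch_ids, fetch_batch, gpu_infer):
        results = []
        if not batch_ids:
            return results
        with ThreadPoolExecutor(max_workers=1) as prefetcher:
            pending = prefetcher.submit(fetch_batch, batch_ids[0])   # warm-up fetch
            for next_id in batch_ids[1:]:
                batch = pending.result()                             # wait for prefetched data
                pending = prefetcher.submit(fetch_batch, next_id)    # start fetching ahead
                results.append(gpu_infer(batch))                     # overlaps the prefetch
            results.append(gpu_infer(pending.result()))              # last batch
        return results

Overlapping the fetch of batch i+1 with inference on batch i is safe here precisely because, as the abstract notes, there is no dependency between batches; only the task order within a batch is fixed.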
Keywords/Search Tags: Query Based On Inference, Pipelining Execution Strategy, Pipelining Scheduling Mechanism