Parallel Optimization And Design Of Population Counting Algorithm For GPGPU On Embedded Platform

Posted on: 2018-08-04
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Zhou
GTID: 2321330515451612
Subject: Software engineering
Abstract/Summary:
The rapid growth of the domestic urban population has greatly increased the likelihood of crowd gatherings in public places. Abnormal crowd events that arise from such gatherings, such as stampedes and disorder, cause heavy loss of life and property. Effectively monitoring and managing crowd dynamics in subways, shopping malls, squares, and other public places has therefore become a pressing practical problem. The number of people in a crowd is the main indicator of an abnormal crowd event, and knowing it helps managers disperse a gathering crowd in time to prevent such events. In recent years, the rapid increase in GPU hardware performance has made general-purpose computing on GPUs (GPGPU) a new way to accelerate digital image processing algorithms. To meet the need for early warning of abnormal crowd events, this thesis proposes a crowd counting algorithm for surveillance video and accelerates its bottleneck modules with GPGPU techniques. First, based on the characteristics of surveillance video from public places such as squares and passageways, the thesis designs and implements a crowd counting algorithm built on foreground extraction, edge detection, and target recognition and tracking, and then profiles its running time. The bottleneck modules are ViBe foreground extraction and Canny edge detection. Next, the thesis uses the OpenCL heterogeneous development framework to design parallel optimization strategies for both modules. For ViBe foreground extraction, it applies NDRange index space optimization and asynchronous execution optimization to accelerate model initialization and model updating on the GPU. For Canny edge detection, it applies memory access optimization, separable convolution, reduced memory access counts, and bounded iteration, respectively, to parallelize high-speed image filtering, gradient magnitude and direction computation, non-maximum suppression, and double-threshold edge linking. The ViBe and Canny algorithms are then benchmarked before and after optimization; the results show that the optimized algorithms effectively reduce running time without affecting the processing results. Finally, the thesis applies the parallelized crowd counting algorithm to a monitoring system, which is implemented and tested on an embedded platform. An overall functional comparison and performance test of the monitoring system show that parallel optimization greatly improves the running efficiency of the time-consuming bottleneck modules: with GPU acceleration, system efficiency increases by 45% to 60% without affecting system operation or monitoring quality.
Keywords/Search Tags: Crowd Counting Algorithm, Open Computing Language, General-Purpose Computing, Embedded Platform