| As the era of big data brings massive amounts of digital information, the resolution of remote sensing images keeps increasing, which raises the requirements for target detection algorithms. Owing to their outstanding feature representation and generalization capabilities, deep learning algorithms have been widely applied to target detection. However, the computational complexity of large-scale deep learning algorithms is too high for them to run on a CPU, and the GPU, the usual acceleration platform, suffers from large physical size and high power consumption. It is therefore hard to apply deep learning algorithms on satellites. In recent years, FPGAs, which offer abundant logic resources, high parallelism and low power consumption, have attracted attention in research on accelerating deep learning. The top priority when deploying deep learning algorithms on an FPGA is to overcome its limited storage resources and weak floating-point computation. To deploy and accelerate CNN-based target detection for remote sensing images on an FPGA, this thesis proposes an FPGA-based hardware and software co-acceleration system that accelerates the inference of deep learning algorithms. Firstly, the TensorFlow framework is used to build YOLOv3. Weights pre-trained on the VOC dataset are used to initialize the algorithm through transfer learning, GIoU is used as the loss function, and the detection performance is evaluated after training. Secondly, based on Xilinx’s Vitis AI scheme, the trained weights are frozen, calibrated, quantized and compiled in order to greatly compress their size. Thirdly, the dedicated AI acceleration unit, the DPU, is placed on the FPGA and scheduled by the main APP to accelerate inference. Finally, following the Zynq SoC development method, an FPGA hardware project is built with the DPU as its core, and an APP is programmed on the ARM processor to perform DPU task scheduling, image preprocessing and Softmax classification. The boot image of the Zynq SoC is then packaged, and inference acceleration is implemented on the co-acceleration system. In this thesis, YOLOv3 is built and trained on the DIOR dataset under Ubuntu 18.04, and the Vitis AI toolkit is used to quantize and compile the trained weights into a DPU instruction stream. The hardware and software co-acceleration system is built on the ZCU104 evaluation board, where YOLOv3 is deployed and accelerated by the APP on the ARM processor scheduling the DPU on the FPGA. The detection rate, detection accuracy and power consumption of the deployed YOLOv3 are compared with the inference performance of CPU and GPU. The experimental results show that our system obtains an mAP score of 56.72 and a detection rate of 26.8 fps. Although the detection accuracy drops by about 2% compared with the model before deployment, the power consumption is only 13.5 W, much lower than that of the CPU and GPU. In addition, our efficiency ratio is 20 times that of the Intel i7-8700, 4.7 times that of the RTX 2080 and 6.2 times that of the GTX 1060, and is also better than that reported in other related papers, which indicates that our system is feasible and suitable for deployment on satellites. |
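
The abstract names GIoU as the training loss without defining it. The following is a minimal sketch of the standard Generalized IoU for two axis-aligned boxes, with the loss taken as 1 - GIoU. It is not the thesis's TensorFlow implementation: the function names `giou` and `giou_loss` and the `(x1, y1, x2, y2)` corner format are assumptions made for illustration.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed (hypothetical) box format: top-left and bottom-right corners,
    with x2 > x1 and y2 > y1.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    if ax2 <= ax1 or ay2 <= ay1 or bx2 <= bx1 or by2 <= by1:
        raise ValueError("boxes must have positive width and height")

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle; width or height is clamped to 0 if disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter

    # Smallest axis-aligned box that encloses both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    iou = inter / union
    # GIoU subtracts the fraction of the enclosing box not covered by the union.
    return iou - (enclose - union) / enclose


def giou_loss(box_pred, box_true):
    """Loss in [0, 2): 0 for identical boxes, larger as the boxes separate."""
    return 1.0 - giou(box_pred, box_true)
```

Unlike plain IoU, which is zero for every pair of non-overlapping boxes, GIoU becomes more negative as the boxes move apart, so the loss still distinguishes near misses from distant ones.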