| Target recognition has been a focus of research since the emergence of computer vision, progressing from traditional recognition methods to today's convolutional neural networks. With the development of big data, convolutional neural networks have gradually replaced traditional recognition methods. To obtain stronger features and higher precision, convolutional neural network models have continued to evolve, including from 2D to 3D; the point cloud neural network is one of the most representative 3D network structures. As networks develop, their structure becomes more complicated and their demands on hardware computing power grow. Although the CPUs and GPUs used to run these networks meet the needs of some applications, they are poorly suited to scenarios that require fast recognition, are constrained in energy consumption, or involve complex environments such as underwater. FPGAs are highly parallel, configurable, flexible, and low-power, so running a highly parallel network such as a convolutional neural network on an FPGA has broad application prospects. This paper selects a lightweight point cloud neural network (Pointnet) as the basis for its research; Pointnet is a three-dimensional neural network that can process point cloud data directly. By analyzing the time and space complexity of the network, the network is optimized without affecting training accuracy, yielding O-Pointnet. The nonlinearity of the network is increased by adding an activation function rather than a bias, which speeds up computation while reducing the memory burden. In addition, this paper accelerates the point cloud neural network on an FPGA. Using a "host + FPGA" computing architecture, the FPGA performs hardware acceleration of the forward propagation part of the point cloud neural network, and the host classifies the trained features. The hardware structure for forward propagation is designed by analyzing the parallelism of the point cloud neural network and the internal resources of the FPGA. The multiplexing and startup of the different convolutions are achieved through timing control. The design is implemented on the Zynq series xc7z045 platform. The whole system adopts partial flow mode and uses 16-bit fixed-point numbers to process a 640×480 depth map. Processing takes 0.279 s per image, or about 3.58 images per second, and the power consumption is only 2.1 W. Experiments show that the design is about 8.4 times faster than the CPU and about 6.6 times faster than the GPU. Its power consumption is also lower than that of the CPU and GPU, which further shows that the design achieves its basic goals of high speed and low power consumption. |
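
The 16-bit fixed-point arithmetic and the reported throughput figure can be illustrated with a minimal Python sketch. The abstract does not state how the 16 bits are split between integer and fractional parts, so the 8 fractional bits used here (`FRAC_BITS`) are purely an assumption, and the helper names `to_fixed`, `to_float`, `fixed_mul`, and `fps_from_latency` are hypothetical. This is not the paper's hardware design, only a software model of the number format.

```python
# Software model of signed 16-bit fixed-point arithmetic.
# ASSUMPTION: 8 fractional bits; the source only says "16-bit fixed point".
FRAC_BITS = 8
SCALE = 1 << FRAC_BITS
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1


def saturate(q: int) -> int:
    """Clamp an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, q))


def to_fixed(x: float) -> int:
    """Quantize a float to the assumed fixed-point format, with saturation."""
    return saturate(round(x * SCALE))


def to_float(q: int) -> float:
    """Convert a fixed-point integer back to a float."""
    return q / SCALE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two fixed-point values.

    The full-width product carries 2 * FRAC_BITS fractional bits, so it is
    shifted right by FRAC_BITS (rounding toward negative infinity) and then
    saturated back to 16 bits.
    """
    return saturate((a * b) >> FRAC_BITS)


def fps_from_latency(seconds_per_image: float) -> float:
    """Throughput in images per second for a given per-image latency."""
    return 1.0 / seconds_per_image


if __name__ == "__main__":
    w, x = to_fixed(1.5), to_fixed(2.0)
    print(to_float(fixed_mul(w, x)))
    print(round(fps_from_latency(0.279), 2))
```

With the 0.279 s latency reported above, `fps_from_latency` reproduces the stated rate of about 3.58 images per second. A different split of integer and fractional bits would trade dynamic range against precision, which is why the choice is flagged as an assumption here.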