| Image processing technology is widely used in fields such as engineering, remote sensing, medicine, meteorology, and the military, and it relies heavily on matrix operations. As the most complex of the basic matrix operations, matrix inversion limits processing capability in image recognition and related application circuits. Efficient techniques such as the Sherman-Morrison formula and accelerated floating-point multiplication and addition alleviate the computational load of matrix inversion, but overall speed remains limited by the many time-consuming division operations. To relieve this bottleneck in applications such as image processing, GPUs or FPGAs are now commonly used; they overcome the slow, non-parallel operation of CPUs, but their power consumption is high. In contrast, building an SoC around a microprocessor core and adding hardware accelerators offers low cost and low power consumption. In view of this, this article proposes an SoC based on the Cortex-M3 core that integrates a single-precision Goldschmidt divider, and applies it in fields such as image processing to improve the speed of matrix inversion in image recognition. The main work of this article is as follows: (1) Based on an analysis of the floating-point format and of division methods for that format, and on verification and analysis of the Goldschmidt algorithm, a hardware divider based on the Goldschmidt algorithm and the bipartite reciprocal table method is proposed. A carry-lookahead adder is designed, a Wallace tree multiplier serves as the iteration unit of the divider, and bipartite reciprocal tables are designed to reduce the number of iteration cycles per division. The divider's Verilog code is simulated in ModelSim, and DC is used for logic synthesis to analyze area and
power consumption. Finally, the design is synthesized on a Xilinx FPGA platform. (2) The Cortex-M3 processor core, which implements the ARMv7-M architecture, has the advantages of open source, low power consumption, and low cost. This work uses the Cortex-M3 as the core of the SoC and connects RAM, ROM, AHB peripherals, and APB peripherals through the AHB bus, the APB bus, an AHB-to-AHB bridge, and an AHB-to-APB bridge to build the overall SoC. Three UARTs, one 32-bit GPIO, three PWM modules, and three SPI peripherals are attached to the APB bus to provide the SoC's basic communication functions. A 32-bit unsigned Wallace tree multiplier and a single-precision floating-point Goldschmidt divider are attached to the AHB bus, and corresponding drivers are written so that the SoC can be used in fields such as image processing. (3) The SoC was prototyped and verified on an Intel FPGA. Based on a 180 nm process library, logic synthesis was carried out in DC, formal verification in Formality, and layout design in ICC, and the final SoC chip layout was generated. |
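
The Goldschmidt iteration at the heart of the divider described in (1) can be illustrated with a short software model. This is only a sketch under stated assumptions, not the hardware design from this work: it uses Python double-precision floats instead of single-precision fixed-width datapaths, and the function names `reciprocal_seed` and `goldschmidt_divide`, the 8-bit seed index, and the two-iteration default are all hypothetical choices made for illustration. In particular, `reciprocal_seed` is a plain single-table stand-in for the bipartite reciprocal tables. The idea is to multiply numerator and denominator by the same correction factor until the denominator converges to 1, at which point the numerator holds the quotient.

```python
import math


def reciprocal_seed(d_norm, index_bits=8):
    """Coarse estimate of 1/d_norm for d_norm in [1, 2).

    Hypothetical stand-in for a reciprocal lookup table: the input is
    quantized to the midpoint of its table interval, as a ROM indexed by
    the top mantissa bits would do.
    """
    step = 2.0 ** -index_bits
    midpoint = (math.floor(d_norm / step) + 0.5) * step
    return 1.0 / midpoint


def goldschmidt_divide(n, d, iterations=2, index_bits=8):
    """Approximate n / d with the Goldschmidt iteration (software model)."""
    if d == 0:
        raise ZeroDivisionError("division by zero")
    sign = -1.0 if (n < 0) != (d < 0) else 1.0
    n, d = abs(n), abs(d)
    if n == 0:
        return 0.0

    # frexp returns d = m * 2**e with m in [0.5, 1); rescale so that
    # d = d_norm * 2**exp with d_norm in [1, 2), like a float mantissa.
    m, e = math.frexp(d)
    d_norm, exp = m * 2.0, e - 1

    # Seed step: scale both operands by the table estimate of 1/d_norm,
    # so the denominator starts close to 1.
    x0 = reciprocal_seed(d_norm, index_bits)
    num, den = n * x0, d_norm * x0

    # Each pass multiplies both operands by f = 2 - den. If den = 1 - eps,
    # the new denominator is 1 - eps**2, so the error is squared each time.
    for _ in range(iterations):
        f = 2.0 - den
        num *= f
        den *= f

    # Undo the exponent scaling applied to the denominator.
    return sign * math.ldexp(num, -exp)


if __name__ == "__main__":
    print(goldschmidt_divide(1.0, 3.0))
    print(goldschmidt_divide(-7.5, 2.5))
```

Because the error is squared on every pass, a more accurate seed directly reduces the number of multiplier passes required, which is the motivation for pairing the iteration with reciprocal tables. In this idealized model, an 8-bit seed index leaves a relative error of at most about 2^-9, and two iterations shrink it to roughly 2^-36, below the 24-bit significand of single precision.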