Research And Design Of Memory Controller For Convolutional Neural Network Accelerator

Posted on:2022-12-15

Degree:Master

Type:Thesis

Country:China

Candidate:C J Li

Full Text:PDF

GTID:2568307049466324

Subject:Integrated circuit engineering

Abstract/Summary:


In recent years, convolutional neural networks (CNNs) have achieved great success in artificial intelligence and other fields. As network scale and computational complexity continue to grow, FPGA accelerators have received extensive attention for accelerating large-scale CNN models because of their low power consumption, high energy efficiency, and configurability. Because of the "memory wall" problem, memory performance has become a key factor limiting the overall performance of the accelerator: the memory access patterns of a CNN accelerator are complicated, and its memory access performance is relatively poor. This paper designs a memory controller for CNN accelerators (NNAMC) to improve the accelerator's memory access performance. The research and design work covers the following two aspects.

(1) Design of the address and pixel transaction method and the memory access patterns of the CNN accelerator. When a CNN accelerator is running, its memory access stream reaches the memory system according to a specific memory access pattern and address mapping strategy. This paper studies the relationship between image pixels and physical addresses, proposes the address and pixel transaction method (APT), and defines its rules. It also proposes memory access patterns for CNN accelerators and designs dedicated test benchmarks for those patterns, which are used to test and analyze memory access performance under different address mapping strategies. Finally, based on the test results and by optimizing bank-level parallelism, this paper proposes an address mapping strategy suitable for NNAMC.

(2) Design of a dedicated memory controller for the CNN accelerator. Based on a dedicated hardware platform, this paper designs the architecture of the dedicated memory controller NNAMC. It uses an address stream access monitoring module to reduce the overhead of tracking the visual address stream. A stream access prediction unit (SAPU) predicts the different patterns of the CNN accelerator's memory access streams. The bank partition model (BMP) is used to optimize the address mapping, and a bank retag operation then directs the address sequence to the target bank. Finally, read and write operations are completed through the hardware control system, and the Strict memory scheduling method is used for memory scheduling, so that the variables affecting memory access performance are kept under control.

This paper designs a corresponding test plan and a dedicated test benchmark for NNAMC, implemented on an FPGA development system, and uses both row buffer hit ratio and system memory access latency to evaluate the memory access performance of NNAMC. The test results show that, compared with other address mapping strategies, NNAMC increases the average row buffer hit ratio by 16.38% (up to 26.17%) and reduces system access latency by 26.3% on average (up to 37.68%). In addition, NNAMC adapts well to the complex parameters of neural networks. Its resource utilization is also low, leaving ample design space for other types of dedicated accelerators.
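To make the evaluation metric concrete, the following minimal Python sketch shows how an address mapping can change the row buffer hit ratio for the same access trace. It is not the NNAMC mapping or the thesis's benchmark: the DRAM geometry (`COLS_PER_ROW`, `NUM_BANKS`, `ROWS_PER_BANK`), the two generic mappings, the open-page model, and the interleaved two-stream trace are all assumptions chosen for illustration.

```python
"""Toy open-page DRAM model: row buffer hit ratio under two address mappings.

All sizes and names below are illustrative assumptions, not NNAMC's design.
"""

COLS_PER_ROW = 1024    # assumed column addresses per DRAM row
NUM_BANKS = 8          # assumed number of banks
ROWS_PER_BANK = 16384  # assumed rows per bank


def row_bank_column(addr):
    """Bank bits sit just above the column bits (row : bank : column)."""
    col = addr % COLS_PER_ROW
    bank = (addr // COLS_PER_ROW) % NUM_BANKS
    row = addr // (COLS_PER_ROW * NUM_BANKS)
    return bank, row, col


def bank_row_column(addr):
    """Bank bits are the most significant bits (bank : row : column)."""
    col = addr % COLS_PER_ROW
    row = (addr // COLS_PER_ROW) % ROWS_PER_BANK
    bank = addr // (COLS_PER_ROW * ROWS_PER_BANK)
    return bank, row, col


def row_buffer_hit_ratio(trace, mapping):
    """Fraction of accesses that find their row already open in their bank."""
    open_rows = {}  # bank -> currently open row (open-page policy)
    hits = 0
    for addr in trace:
        bank, row, _col = mapping(addr)
        if open_rows.get(bank) == row:
            hits += 1
        open_rows[bank] = row  # a miss closes the old row and opens this one
    return hits / len(trace) if trace else 0.0


def interleaved_streams(base_a, base_b, length):
    """Alternate between two sequential streams, e.g. two data regions read in lockstep."""
    trace = []
    for i in range(length):
        trace.append(base_a + i)
        trace.append(base_b + i)
    return trace


# Two streams that start exactly one DRAM row apart.
trace = interleaved_streams(0, COLS_PER_ROW, 512)
for mapping in (row_bank_column, bank_row_column):
    print(mapping.__name__, row_buffer_hit_ratio(trace, mapping))
```

With these assumed parameters, the row : bank : column mapping places the two streams in different banks, so only the first access to each bank misses (1022 hits out of 1024 accesses). The bank : row : column mapping places both streams in the same bank on adjacent rows, so every access closes the row the other stream just opened and the hit ratio drops to 0. This is the kind of sensitivity to bank-level parallelism that an access-pattern-aware address mapping tries to exploit.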

Keywords/Search Tags:

Memory Controller, DRAM, Address Mapping, Memory Access Optimization


Related items

1	Performance Analysis Of Off-Chip Memory Architecture
2	Test Characterization And Optimization Of Resistive Memory And Dynamic Random Access Memory For Embedded Applications
3	Research On Memory Mapping Methods Of Reconfigurable And SIMT Processor System Architectures
4	Modeling Of DDR4 DRAM Access Latency
5	Design And Optimization On Multi-core Shared Memory Controller Based On Network Processor
6	The Optimization Of Memory Controller For High Performance CPU
7	Power-saving method for DRAM/eDRAM and 3D-DRAM exploiting the process variations, temperature changes, device degradation, and memory access workload variations and innovative heterogeneous memory management approach using 3D-DRAM with Quality of Service
8	Trace-Based Modeling And Adapting Of Multi-Channel DDR Memory Controller
9	Modeling and design of high-performance and power-efficient 3D dram architectures
10	Research On Key Techniques Of Compiler-based Memory Access Analysis And Optimization