
Architecture Design And Optimization Of Domain-specific Heterogeneous Multi-core System-on-Chip

Posted on: 2012-11-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M Yan
GTID: 1118330341951623
Subject: Electronic Science and Technology
Abstract/Summary:
The multicore era of embedded processors has arrived with the development of semiconductor manufacturing technology. The number of applications running on embedded systems is increasing rapidly. The goal of embedded system architecture design has shifted from being micro-architecture centered to being target-application centered, and the design mode has changed from adapting applications to the system architecture to fitting the system architecture to the applications. The main topic of embedded architecture research in the multicore era is how to design highly efficient and highly available architectures for specific applications. Coarse-grained reconfigurable architecture (CGRA) and application-specific instruction-set processor (ASIP) are the two mainstream kinds of highly efficient domain-specific architectures. The work presented in this thesis focuses on the design and optimization of domain-specific architectures. The design and implementation of these two kinds of architectures, together with design optimization methods, are presented. The main contributions of this thesis are as follows:

1. On reconfigurable architecture design: after a comparison and analysis of the abstract models of several mainstream architectures, a new architecture named programmable dataflow architecture (ProDFA) is proposed, and a blockcipher-specific reconfigurable SoC based on ProDFA is designed. With the support of fine-grained long-term control logic, ProDFA combines flexible programmability with highly efficient dataflow computing. Based on an analysis of the computing characteristics of several blockciphers, a blockcipher-specific reconfigurable SoC with four reconfigurable processing units and one reconfigurable control unit is implemented on ProDFA. Application mapping on this reconfigurable SoC is aided by automatic synchronization tools with the support of a configuration description language (CDL). Compared with several algorithm-specific FPGA designs, the reconfigurable structure shows high performance and hardware efficiency.

2. On domain-specific design methods: a fast candidate subgraph generation method for the function unit design of reconfigurable architectures is proposed. For the first time, a top-down maximum valid subgraph enumeration (MVSE) algorithm is combined with a topological search algorithm, which improves the process of enumerating and identifying candidate subgraphs in a given dataflow graph. During the MVSE process, a clustering algorithm for invalid nodes is used to reduce the number of invalid nodes, which determines the complexity of the MVSE algorithm. The top-down MVSE algorithm uses an invalid node elimination method that splits the original dataflow graph into maximum valid subgraphs. A heuristic searching and identifying process is applied after the MVSE to obtain all candidate subgraphs that could be accelerated by hardware. A grouping process is then applied to the candidate subgraphs to generate candidate subgraph groups, such that the subgraphs in one group can be supported efficiently by a single function unit. Experimental results show that the performance of MVSE is improved in most situations compared with bottom-up MVSE algorithms, and that the candidate subgraph groups are effective and practical.

3. On ASIP architecture design: a heterogeneous multicore SoC for embedded visual media processing (EVMPSoC) is designed and implemented. EVMPSoC comprises one high-performance embedded processor and two SIMD application-specific processors. The two application-specific processors share a common basic instruction set, and each is extended with different application-specific instructions. The two coprocessor cores are tightly coupled with a display driver unit and a wide external memory interface through a multi-channel communication module. Experiments on micro-kernels show a high speedup of EVMPSoC over a general embedded processor. The SoC is implemented in a 0.13 µm CMOS standard cell technology. A typical visual media algorithm is tested on the SoC to evaluate its performance and efficiency.

4. On performance optimization: a high-level loop optimization method based on polyhedral transformation is proposed to exploit the parallelism of EVMPSoC. Maximized parallelization and minimized communication for a given affine loop nest are obtained through polyhedral transformation. Three different levels of parallelism in EVMPSoC are exploited with this technique: SIMD data-level parallelism, multicore thread-level parallelism, and memory pipeline parallelism. Experiments on three typical application algorithms show a substantial performance improvement after the optimization.
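The invalid node elimination step mentioned under contribution 2 can be illustrated with a small sketch. This is not the thesis's algorithm; it is a simplified version of one common top-down formulation, under these assumptions: the dataflow graph is a DAG given as a successor dictionary, a "valid" subgraph is one that contains no invalid node and is convex (no path between two of its nodes leaves it), and input/output port limits are ignored. The function names and the example graph are hypothetical. For each invalid node, the sketch drops either all of its ancestors or all of its descendants, so the loop is exponential in the number of invalid nodes, which suggests why reducing that number by clustering matters.

```python
from itertools import product


def reachable(graph, start):
    """Return every node reachable from `start` (excluding `start` itself)."""
    seen, stack = set(), [start]
    while stack:
        for nxt in graph.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def maximal_valid_subgraphs(succ, invalid):
    """Enumerate maximal convex subgraphs of a DAG that avoid `invalid` nodes.

    Assumption: validity means "convex and free of invalid nodes";
    I/O constraints of a real function unit are not modelled here.
    """
    nodes = set(succ) | {v for vs in succ.values() for v in vs}
    # Build the predecessor map so ancestors can be found by reachability.
    pred = {n: [] for n in nodes}
    for u, vs in succ.items():
        for v in vs:
            pred[v].append(u)

    invalid = sorted(invalid)
    ancestors = {x: reachable(pred, x) for x in invalid}
    descendants = {x: reachable(succ, x) for x in invalid}

    candidates = set()
    # For each invalid node, eliminate it plus one side (ancestors or descendants).
    for choice in product((ancestors, descendants), repeat=len(invalid)):
        dropped = set(invalid)
        for x, side in zip(invalid, choice):
            dropped |= side[x]
        candidates.add(frozenset(nodes - dropped))

    # Keep only non-empty candidates that are not contained in a larger one.
    maximal = [c for c in candidates
               if c and not any(c < other for other in candidates)]
    return sorted(sorted(c) for c in maximal)


# Hypothetical dataflow graph: 'm' stands for an operation that cannot be
# mapped to hardware (for example, a memory access).
dfg = {"a": ["b", "c", "e"], "b": ["m"], "m": ["c"], "c": ["d"]}
subgraphs = maximal_valid_subgraphs(dfg, invalid={"m"})
print(subgraphs)
```

In this example, keeping both `a` and `c` would create a path that leaves the subgraph through `m`, so the sketch returns two alternatives: one without the ancestors of `m` and one without its descendants. A subsequent search and grouping stage, as described in the abstract, would then pick candidate subgraphs from inside these.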
Keywords/Search Tags: system architecture, heterogeneous multicore, domain-specific design, design automation, performance optimization