Algorithms for compiler-assisted design space exploration of clustered VLIW ASIP datapaths

Posted on: 2002-01-06
Degree: Ph.D.
Type: Dissertation
University: The University of Texas at Austin
Candidate: Lapinskii, Viktor
Full Text: PDF
GTID: 1462390011997140
Subject: Engineering
Abstract/Summary:
Clustered Very Long Instruction Word Application-Specific Instruction Set Processors (VLIW ASIPs), combined with effective compilation techniques, enable aggressive exploitation of the instruction-level parallelism inherent in many embedded media applications, while unlocking a variety of possible performance/cost tradeoffs. In this dissertation we propose and validate an algorithm to support early design space exploration (DSE) over classes of datapaths, in the context of a specific target application, and carry out an empirical study for a set of representative benchmarks. We argue that at an early DSE phase one should use design space parameters that have a first-order impact on two key physical figures of merit: clock rate f and power dissipation P. We found these parameters to be: the maximum cluster capacity (number of functional units in a cluster) NF, the number of clusters NC, and the interconnect capacity NB.

The experimental validation of our DSE algorithm shows that a thorough exploration of the complex design space can be performed very efficiently in this parameterized design space. Moreover, our case studies suggest that "penalties" of clustered versus non-clustered datapaths are often minimal and that clustering indeed unlocks a variety of valuable design alternatives.

Our exploration methodology is enabled by an efficient algorithm for binding operations in a dataflow graph to the clusters of a datapath, so as to minimize latency and the number of data transfers. The algorithm utilizes effective cost and ranking functions that enable the exploration of complex tradeoffs between: (1) operation serialization, due to cluster overload; and (2) penalties incurred by data transfers, due to scattering operations with data dependencies over different clusters. The core binding algorithm has shown robustness over a large set of datapaths and application kernels, and demonstrated up to 29% improvement in schedule latency compared to a state-of-the-art binding algorithm.

Although some of the individual algorithms developed in the context of this dissertation are also of direct interest to retargetable compilers, our main contribution is design space exploration of clustered datapaths. Such an exploration is an essential phase in the overall methodology to support specialization/exploration/design of application-specific processors and associated memory subsystems.
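
To make the serialization versus data-transfer tradeoff concrete, the following is a minimal sketch of a toy cost proxy. It is not the dissertation's cost or ranking function, which the abstract does not specify. The function name `proxy_cost`, the representation of the dataflow graph as a list of edges, and the way the two terms are added together are all assumptions made for illustration. The parameters `nf` and `nb` stand for NF and NB as defined above.

```python
import math
from collections import Counter


def proxy_cost(edges, binding, nf, nb):
    """Toy estimate of the cost of a binding (hypothetical, for illustration).

    edges   -- list of (producer, consumer) pairs of the dataflow graph
    binding -- dict mapping each operation to a cluster index
    nf      -- functional units per cluster (NF)
    nb      -- interconnect capacity, transfers per cycle (NB)
    """
    # Serialization term: a cluster holding n operations needs at least
    # ceil(n / nf) issue cycles, so the busiest cluster gives a lower bound.
    load = Counter(binding.values())
    issue_bound = max(math.ceil(n / nf) for n in load.values())

    # Transfer term: every dependency whose endpoints sit in different
    # clusters needs a data transfer; nb of them fit in one cycle.
    transfers = sum(1 for u, v in edges if binding[u] != binding[v])
    transfer_bound = math.ceil(transfers / nb)

    # Assumption: the two terms are simply added. A real cost function
    # would weigh them against the schedule's critical path.
    return issue_bound + transfer_bound


# Example: a small graph a -> c, b -> c, c -> d on a datapath with NF = 2, NB = 1.
edges = [("a", "c"), ("b", "c"), ("c", "d")]
split = {"a": 0, "b": 1, "c": 0, "d": 0}    # b is bound to the second cluster
packed = {"a": 0, "b": 0, "c": 0, "d": 0}   # everything in one cluster
print(proxy_cost(edges, split, nf=2, nb=1))
print(proxy_cost(edges, packed, nf=2, nb=1))
```

Packing all operations into one cluster removes transfers but raises the serialization term, while scattering them does the opposite; a binding algorithm of the kind described above has to balance the two.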
Keywords/Search Tags:Exploration, Design space, Cluster, Algorithm, Datapaths