
VLIW processors: Efficiently exploiting instruction level parallelism

Posted on: 2001-04-12
Degree: Ph.D.
Type: Dissertation
University: Stanford University
Candidate: Rudd, Kevin William
Full Text: PDF
GTID: 1468390014455766
Subject: Electrical engineering
Abstract/Summary:
This dissertation explores high-performance, complexity-efficient processors, focusing on VLIW processors. Complexity efficiency is a qualitative characteristic that describes a system whose performance has not reached the point of diminishing returns. Using the techniques described in this dissertation, simple statically scheduled very-long-instruction-word (VLIW) processors can be efficient architectures for exploiting instruction-level parallelism and can effectively address the needs of general-purpose computing.

We studied the ability of dynamic execution to exploit instruction-level parallelism in dynamic VLIW processors. Unlike previous studies, this study explores the benefits of dynamic execution on an instruction stream with explicit instruction-level parallelism. Dynamic execution is thus applied to problems that compilers have difficulty solving rather than to problems that compilers readily solve, reducing the need for complex and costly hardware. In addition to presenting performance results, we describe a general processor model and execution definition that improves upon the precise execution model used in traditional processors, as well as the simulator that implements this new execution model. In our simulations, we varied a number of parameters, allowing the individual effect of each parameter on performance to be extracted. The simulation results show that a small amount of reordering is adequate to eliminate almost all penalties associated with scheduling errors and latency variations, but even a significant amount of reordering is inadequate to eliminate the penalties associated with branch mispredictions and long memory latencies.

As an alternative to dynamic VLIW processors, we developed Replay Buffers to extend static VLIW processors to support efficient multi-threading. Replay Buffers provide zero-switch-cycle thread switches as well as overhead-free exception handling (beyond the cost of the exception handler) and reasonable latency tolerance for delays. Replay Buffers allow VLIW processors to meet the needs of general-purpose applications without the complexity of dynamic VLIW. In addition to improving the capabilities and performance of VLIW processors, this technique has applications beyond VLIW processors and can benefit all processors and systems using pipelines, particularly those using wave pipelining.
Keywords/Search Tags: VLIW processors, Parallelism, Problems that compilers, Performance