
Heterogeneous architecture for Big Data analytics

Posted on: 2017-12-29    Degree: Ph.D    Type: Dissertation
University: University of Massachusetts Lowell    Candidate: Li, Peilong    Full Text: PDF
GTID: 1468390014953146    Subject: Computer Engineering
Abstract/Summary:
Compute-intensive workloads such as analytic and scientific algorithms impose new challenges on traditional High Performance Computing (HPC) and Big Data platforms. On one hand, HPC platforms integrate heterogeneous computing resources to gain better performance and power efficiency; however, heterogeneous architectures face challenges in transparent acceleration and in allocating resources among cores and accelerators. On the other hand, popular Big Data platforms such as MapReduce and Apache Spark address big data challenges through data and computation distribution and in-memory caching, yet these CPU-only frameworks still struggle to meet increasing analytics speed requirements. Moreover, because workloads on HPC and Big Data platforms are dynamic and varied, it is challenging to predict incoming applications and make scheduling decisions for heterogeneous resources.

In this dissertation, we explore the use of heterogeneous architectures at both the micro level, in HPC System-on-Chip (SoC) designs, and the macro level, in Big Data clusters. We first propose "Transformer", a run-time reprogrammable, heterogeneous SoC architecture consisting of cores and reconfigurable logic that supports coarse-grained acceleration of dynamic, unpredictable workloads. The architecture allows one or more acceleration functions to be instantiated at run time, thus achieving higher performance and energy efficiency. We then present "HeteroSpark", a GPU-accelerated heterogeneous architecture integrated with Apache Spark, which combines the massive compute power of GPUs with the scalability of CPUs and system memory resources for applications that are both data- and compute-intensive. Lastly, we implement a generic heterogeneous load balancer, "Sparkling+", which employs low-overhead dynamic profiling techniques to adaptively schedule tasks to heterogeneous computing resources for optimal performance.
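The profile-guided scheduling idea behind a load balancer such as "Sparkling+" can be illustrated with a minimal sketch. This is not the dissertation's implementation; the class name, device labels, and the exponential-moving-average cost model are assumptions chosen purely for exposition of the general technique of adaptive scheduling over heterogeneous resources.

```python
# Illustrative sketch of profile-guided heterogeneous scheduling: keep a
# running throughput estimate per device, dispatch each task to the device
# with the lowest predicted completion time, and refine estimates online
# from observed task timings.

class AdaptiveScheduler:
    def __init__(self, devices, alpha=0.3):
        # Start with equal throughput estimates (work units per second).
        self.throughput = {d: 1.0 for d in devices}
        self.alpha = alpha  # smoothing factor for the moving average

    def pick_device(self, task_size):
        # Predicted time = work / estimated throughput; choose the minimum.
        return min(self.throughput, key=lambda d: task_size / self.throughput[d])

    def record(self, device, task_size, elapsed):
        # Exponential moving average keeps the profile adaptive to workload drift.
        observed = task_size / elapsed
        old = self.throughput[device]
        self.throughput[device] = (1 - self.alpha) * old + self.alpha * observed


sched = AdaptiveScheduler(["cpu", "gpu"])
# Feedback from completed tasks: the GPU finished the same amount of work
# faster, so subsequent tasks migrate toward it.
sched.record("gpu", task_size=1000, elapsed=0.1)  # ~10000 units/s observed
sched.record("cpu", task_size=1000, elapsed=1.0)  # ~1000 units/s observed
print(sched.pick_device(500))  # prints "gpu"
```

A real system would add per-operator profiles and contention awareness, but the core loop, namely predict, dispatch, measure, update, is the same.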
Keywords/Search Tags:Big data, Heterogeneous, HPC, Performance, Computing, Resources