
Design And Implementation Of Enhanced Parallel Computing Framework System In Cloud

Posted on: 2014-02-05
Degree: Master
Type: Thesis
Country: China
Candidate: H K Tu
Full Text: PDF
GTID: 2248330398972374
Subject: Computer Science and Technology
Abstract/Summary:
Cloud Computing is one of the hottest topics in the information technology industry and has attracted considerable attention from industry, academia, and government. As a new Internet-based computing model, Cloud Computing aims to provide secure, fast, and convenient data storage and computing services to users, with open standards and services as its foundation and the Internet at its center. It combines traditional computer and network technologies such as Parallel Computing, Distributed Computing, Grid Computing, Utility Computing, Network Storage Technologies, Virtualization, and Load Balancing. Put another way, Cloud Computing is the commercial realization of these computer science concepts. While Cloud Computing receives wide attention, Parallel Computing, the foundation on which it developed, deserves attention as well, which makes research on Parallel Computing all the more meaningful.

Parallel Computing refers to solving a computing problem by using multiple computing resources at the same time. Since the birth of Parallel Computing, solutions have mostly remained at the hardware level. It was not until Google introduced the MapReduce parallel programming model that Parallel Computing became widely applied in commercial scenarios. Hadoop, an open-source implementation of MapReduce, has been successfully applied to scenarios such as log analysis, parallel computing, ETL, and machine learning. However, there is a large gap between knowing Hadoop and being proficient with it. To become proficient, users need a good understanding of Hadoop cluster configuration, the Hadoop programming API, and the operation of Hadoop tasks. This undoubtedly makes things hard for MapReduce beginners.

This paper presents the Enhanced Parallel Computing Framework System in Cloud (EPCFS), which aims to provide MapReduce computing capabilities to users as cloud services. EPCFS can be deployed in today's popular cloud environments. It makes full use of the high scalability, high reliability, and high flexibility of cloud computing, and provides an automatically configured Hadoop cluster according to users' requirements. More importantly, it introduces a set of annotations that make programming easier for beginners, who no longer have to learn the Hadoop API or write large amounts of glue code between business logic and the Hadoop API. With the annotations, users simply write their code and upload it to EPCFS, and the system does the rest, which is fast, efficient, and flexible.

This paper first introduces Cloud Computing, Parallel Computing, and the background and significance of EPCFS, then presents the key technologies used in EPCFS and compares several solutions. Next, it describes the design and implementation of the enhanced parallel computing framework system, which is based on the lab-owned IaaS platform, and elaborates on the interactions among the function modules in EPCFS. Finally, the last chapter presents the functional tests and performance data analysis, draws conclusions about EPCFS, and identifies where it can be improved.
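
The annotation-driven style described in the abstract can be pictured with a rough sketch. The abstract does not list the actual EPCFS annotations, so the names `@MapTask` and `@ReduceTask`, the `WordCount` job, and the in-memory `run` method below are all hypothetical stand-ins. The sketch uses plain Java reflection rather than Hadoop, only to show how user code might carry business logic alone while a framework discovers and wires the map and reduce steps.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class AnnotationSketch {

    // Hypothetical marker annotations; NOT the actual EPCFS annotation names.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface MapTask {}

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface ReduceTask {}

    // User code: business logic only, with no framework API calls.
    public static class WordCount {
        @MapTask
        public List<Map.Entry<String, Integer>> split(String line) {
            List<Map.Entry<String, Integer>> out = new ArrayList<>();
            for (String w : line.trim().split("\\s+")) {
                if (!w.isEmpty()) out.add(Map.entry(w, 1)); // emit (word, 1)
            }
            return out;
        }

        @ReduceTask
        public Integer sum(String word, List<Integer> counts) {
            int total = 0;
            for (int c : counts) total += c;
            return total;
        }
    }

    // Toy in-memory runner (an assumption for illustration): finds the
    // annotated methods by reflection, then runs map, group-by-key, reduce.
    @SuppressWarnings("unchecked")
    public static Map<String, Integer> run(Object job, List<String> lines) throws Exception {
        Method mapper = find(job, MapTask.class);
        Method reducer = find(job, ReduceTask.class);

        // Map phase plus grouping of emitted values by key.
        Map<String, List<Integer>> groups = new TreeMap<>();
        for (String line : lines) {
            List<Map.Entry<String, Integer>> pairs =
                    (List<Map.Entry<String, Integer>>) mapper.invoke(job, line);
            for (Map.Entry<String, Integer> p : pairs) {
                groups.computeIfAbsent(p.getKey(), k -> new ArrayList<>()).add(p.getValue());
            }
        }

        // Reduce phase: one reducer call per distinct key.
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> g : groups.entrySet()) {
            result.put(g.getKey(), (Integer) reducer.invoke(job, g.getKey(), g.getValue()));
        }
        return result;
    }

    private static Method find(Object job, Class<? extends Annotation> marker) {
        for (Method m : job.getClass().getDeclaredMethods()) {
            if (m.isAnnotationPresent(marker)) return m;
        }
        throw new IllegalArgumentException("no method annotated with " + marker.getSimpleName());
    }

    public static void main(String[] args) throws Exception {
        Map<String, Integer> counts = run(new WordCount(), List.of("a b a", "b c"));
        System.out.println(counts); // prints {a=2, b=2, c=1}
    }
}
```

In a real deployment the runner's role would presumably be played by generated Hadoop job code running on the auto-configured cluster; the sketch only illustrates why annotated methods can spare a beginner from writing glue code.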
Keywords/Search Tags: Cloud Computing, Parallel Computing, Hadoop, MapReduce, Annotations