
Design And Implementation Of MapReduce Programming Framework For Small Files Set For Agricultural Informatization Processing

Posted on: 2018-03-19
Degree: Master
Type: Thesis
Country: China
Candidate: H Q Liu
Full Text: PDF
GTID: 2393330590475727
Subject: Agricultural extension
Abstract/Summary:
In the process of agricultural informatization, big data technology is widely used to explore agricultural resources, optimize configuration design, capture consumer demand, and track market changes. To uncover the information hidden in massive data sets, compute-intensive and data-intensive scientific data analysis is becoming increasingly popular. Hadoop MapReduce has emerged as an effective framework for large-scale data analytics: it processes large data sets easily by relying on a distributed file system. However, the performance of the existing MapReduce framework is far from ideal on small-scale data sets, because it distributes data blocks evenly across the cluster nodes and reads and writes files through the distributed file system. To address this problem, this paper investigates a MapReduce programming abstraction built on Comet. The prototype system effectively avoids the bottleneck caused by data storage, balances processing speed against cost, and improves availability and scalability. The main work of this paper is summarized as follows:
(1) It analyzes the job-processing flow of the Hadoop MapReduce programming abstraction and identifies its usability and extensibility shortcomings when handling sets of small files. It then proposes a MapReduce programming abstraction built on the Comet framework that supports Master-Worker (BOT) parallel asynchronous computation.
(2) To address the problems of implementing the Comet MapReduce prototype system, it focuses on the system's task-processing workflow, the design of the key interfaces and classes, and user-objective-driven self-scheduling. It then analyzes the similarities and differences between the API of this system and that of Hadoop MapReduce, so that Hadoop MapReduce features can be ported in the future.
(3) Finally, using an application that mines PDB for distance information, it evaluates the Comet MapReduce programming framework in terms of running time, memory consumption, and load balancing, and compares its performance with that of the Hadoop MapReduce system.
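The Master-Worker idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation and does not use any Comet or Hadoop API. It assumes that the small files are already held in memory as a dictionary, that a thread-safe queue stands in for whatever coordination space Comet provides, and that workers pull tasks themselves until the bag is empty. The names `run_bag_of_tasks`, `word_count_map`, and `word_count_reduce` are hypothetical and chosen only for illustration.

```python
import queue
import threading
from collections import defaultdict


def run_bag_of_tasks(inputs, map_fn, reduce_fn, num_workers=4):
    """Run a MapReduce job over small in-memory inputs with pull-based workers.

    Assumption: `inputs` maps a file name to its (small) text content, so no
    distributed file system is involved in this sketch.
    """
    tasks = queue.Queue()
    for name, content in inputs.items():
        tasks.put((name, content))  # master: one map task per small file

    intermediate = defaultdict(list)  # key -> list of mapped values
    handled = defaultdict(int)        # worker id -> number of tasks completed
    lock = threading.Lock()

    def worker(worker_id):
        while True:
            try:
                name, content = tasks.get_nowait()  # worker pulls its next task
            except queue.Empty:
                return  # the bag is empty, so this worker stops
            pairs = list(map_fn(name, content))
            with lock:
                for key, value in pairs:
                    intermediate[key].append(value)
                handled[worker_id] += 1

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Reduce phase: combine all values collected for each key.
    results = {key: reduce_fn(key, values) for key, values in intermediate.items()}
    return results, dict(handled)


def word_count_map(name, content):
    """Illustrative map function: emit (word, 1) for every word in a file."""
    for word in content.split():
        yield word, 1


def word_count_reduce(key, values):
    """Illustrative reduce function: sum the counts for one word."""
    return sum(values)
```

Calling `run_bag_of_tasks({"a.txt": "rice wheat rice", "b.txt": "wheat corn"}, word_count_map, word_count_reduce)` returns the word counts together with a per-worker task tally. Because each worker takes a new task only when it is free, faster workers naturally process more files, which is one simple way to approximate the load balancing the abstract evaluates.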
Keywords/Search Tags: Comet Framework, MapReduce Programming Abstraction, Master-Worker (BOT), User-Objective-Driven Scheduling