Research On Coded Distributed Computing Schemes Based On Grouping Design And Replication

Posted on: 2024-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Dong
Full Text: PDF
GTID: 2558307061491994
Subject: Software engineering
Abstract/Summary:
The MapReduce framework is the current mainstream distributed computing framework. When it processes tasks, the data exchange phase often consumes a large share of the execution time. Coded distributed computing, which combines distributed computing with coding techniques, has therefore been proposed to reduce communication load and task execution time, and a scheme that achieves the optimal communication load has been given. However, implementing that scheme requires a large number of input files and output functions, which increases computation time and makes the scheme difficult to apply in practice. How to reduce the number of input files and output functions while keeping the communication load low is therefore a topic worth studying. This thesis proposes coded distributed computing schemes that reduce the number of input files and output functions for both homogeneous and heterogeneous distributed computing systems. In a homogeneous system, all computers (nodes) have the same ability to store files, compute functions, and transmit signals; in a heterogeneous system, these abilities differ across nodes. The details are as follows.

(1) The scheme proposed for homogeneous distributed computing systems (the Ho-CDC scheme) first designs a standard group by grouping some of the nodes, with the nodes in the same standard group responsible for storing the same input files and computing the same output functions. Next, the nodes in the non-standard groups copy the input file and output function assignment of the corresponding nodes in the standard group, which determines the assignment of input files and output functions for the entire system. For signal transmission, the nodes in the initially selected standard group multicast coded signals to the other nodes in a specific transmission group. Theoretical analysis and experiments show that the Ho-CDC scheme greatly reduces the number of input files and output functions, while its communication load is less than 1.324 times the optimal communication load and less than 1.334 times that of the scheme proposed by Woolsey et al.

(2) This thesis extends the grouping-and-replication coded distributed computing scheme described above to heterogeneous distributed computing systems. The resulting scheme (the He-CDC scheme) accounts for nodes having different storage and computing capabilities. Unlike in the Ho-CDC scheme, when the standard groups are formed, nodes in different standard groups store different numbers of files and compute different numbers of functions. Experimental results show that, compared with the scheme that achieves the optimal communication load, the proposed scheme greatly reduces the required number of input files and output functions, while its communication load is less than 1.295 times that of the optimal scheme.
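To make the trade-off above concrete, the following is a minimal Python sketch, not the Ho-CDC or He-CDC construction itself. The abstract does not name the optimal scheme, so the sketch assumes the well-known non-cascaded setting in which each file is mapped at r of K nodes, the uncoded normalized load is 1 - r/K, the optimal coded load is (1/r)(1 - r/K), and one batch of files is placed on every r-subset of nodes, so the file count must be a multiple of C(K, r). All function and variable names are hypothetical. The toy part shows a single XOR multicast for K = 3, r = 2.

```python
from fractions import Fraction
from math import comb


def uncoded_load(K: int, r: int) -> Fraction:
    """Normalized load when each file is mapped at r of K nodes, without coding."""
    return 1 - Fraction(r, K)


def coded_load(K: int, r: int) -> Fraction:
    """Assumed optimal coded load (1/r)(1 - r/K) for the non-cascaded setting."""
    return Fraction(1, r) * (1 - Fraction(r, K))


def min_files(K: int, r: int) -> int:
    """Smallest file count when every r-subset of nodes holds its own batch."""
    return comb(K, r)


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


# Toy case K=3, r=2 (assumed placement): node 1 stores files {1,2},
# node 2 stores {2,3}, node 3 stores {1,3}; node q reduces function q.
# v[(q, n)] is the intermediate value of output function q on file n.
v = {(q, n): bytes([16 * q + n] * 4) for q in (1, 2, 3) for n in (1, 2, 3)}

# Node 1 maps files 1 and 2, so it knows v[(2, 1)] (needed by node 2)
# and v[(3, 2)] (needed by node 3). It multicasts one XOR of two half-values.
signal = xor(v[(2, 1)][:2], v[(3, 2)][:2])

# Node 2 stores file 2, so it can recompute v[(3, 2)] and cancel it.
decoded_at_node2 = xor(signal, v[(3, 2)][:2])
# Node 3 stores file 1, so it can recompute v[(2, 1)] and cancel it.
decoded_at_node3 = xor(signal, v[(2, 1)][:2])

# The load shrinks by a factor of r, but the required file count grows quickly.
for K, r in [(3, 2), (10, 5), (20, 10)]:
    print(K, r, uncoded_load(K, r), coded_load(K, r), min_files(K, r))
```

Under these assumptions, one multicast serves two nodes at once, which is where the factor-of-r saving comes from, while the C(K, r) growth in the file count illustrates why a scheme needing fewer input files is of practical interest.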
Keywords/Search Tags: coded distributed computing, MapReduce, grouping design and replication, homogeneous distributed computing system, heterogeneous distributed computing system