
Research On Cloud Computing Task Scheduling For Remote Sensing Big Data Applications

Posted on: 2019-07-03
Degree: Master
Type: Thesis
Country: China
Candidate: X L Yin
Full Text: PDF
GTID: 2432330551460867
Subject: Computer application technology
Abstract/Summary:
Remote sensing uses non-contact, long-range detection to analyze radiated and reflected information, and it is an important way to monitor and survey the earth's resources. With advances in optical technology, radio electronics, and computer science, remote sensing can now achieve high spectral resolution while retaining high spatial resolution. As a result, the types of remote sensing data are becoming more complex and the volume of data keeps growing, so remote sensing data show the clear characteristics of big data. Processing remote sensing images with a traditional stand-alone method runs into computational and storage bottlenecks. Cloud computing offers parallel computing, high scalability, and high fault tolerance, which can remove the single-machine bottleneck and bring new solutions for remote sensing image processing.

This thesis covers the storage and fault-tolerance mechanisms of HDFS and the MapReduce programming model. On this basis, a distributed parallel method for fusion classification of hyperspectral images is designed and implemented. To minimize the makespan, we describe the tasks of the parallel optimization method with a directed acyclic graph (DAG) and construct a makespan-minimization scheduling model under resource constraints. For this model, we propose a task scheduling method based on a hybrid quantum evolutionary algorithm, which modifies the task execution sequence and expands the search scope for the optimal solution. The experimental results show that the proposed parallel method and task scheduling method can handle large-scale remote sensing data and improve classification efficiency while preserving classification accuracy. The main work is as follows.

1. A distributed parallel optimization method for hyperspectral image fusion classification based on Spark is proposed. First, features are extracted and fused in a distributed manner at the subpixel, pixel, and superpixel levels. Then the parameters and models of the SVM classifier are trained on the training set. Finally, the SVM classifier is used to classify the pixels of the hyperspectral image. By designing the running logic of the parallel algorithm carefully, we reduce the generation of intermediate data and the time spent on Java memory management. By designing a reasonable data partition that reduces the correlation between data partitions, we reduce shuffle time and data transmission time. Classification accuracy is ensured by data repartitioning. The results show that this method can process big data, improves processing performance, and achieves a good speedup.

2. A scheduling model for minimizing makespan under resource constraints is proposed. First, according to the processing flow of the parallel algorithm and the data transmission between tasks, we determine the partial order among tasks and abstract the distributed optimization tasks as the nodes of a DAG. Then, according to the constraints on start time, task execution priority, and the number of computing resources, a resource-constrained makespan optimization scheduling model is constructed on the DAG, whose optimization objective is the completion time of all tasks.

3. A task scheduling method based on a hybrid quantum evolutionary algorithm is proposed. Since the optimization model is an NP-complete problem, it is difficult to find the best solution, so we use a heuristic algorithm to solve it. Based on the quantum evolutionary algorithm and the DAG, this thesis proposes a task scheduling algorithm for processing remote sensing big data, which allocates appropriate computing resources to tasks and minimizes the total running time. To avoid getting trapped in a local optimum, this thesis also designs a single-objective optimization algorithm based on hybrid quantum evolution. This method improves the search ability and expands the search range for the optimal solution by modifying the submission sequence of tasks. The experimental results show that the hybrid quantum evolutionary scheduling algorithm can significantly reduce the total runtime.
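
The scheduling model described above (tasks as DAG nodes, a limited number of computing resources, makespan as the objective) can be illustrated with a small sketch. This is not the thesis's algorithm: it is a plain list scheduler that evaluates the makespan of a given task submission sequence, plus a brute-force search over sequences standing in for the hybrid quantum evolutionary search. The task names, durations, precedence edges, and the function name `schedule_makespan` are all hypothetical and chosen only for illustration.

```python
from itertools import permutations


def schedule_makespan(durations, preds, priority, num_workers):
    """Greedy list scheduling of a task DAG on identical workers.

    durations:   {task: run time}            (hypothetical values)
    preds:       {task: [predecessor tasks]} (DAG edges)
    priority:    submission sequence of all tasks
    num_workers: number of identical computing resources
    Returns (makespan, {task: finish time}).
    """
    finish = {}
    worker_free = [0.0] * num_workers  # time at which each worker becomes idle
    pending = list(priority)
    while pending:
        # Take the first pending task whose predecessors are all scheduled.
        for task in pending:
            if all(p in finish for p in preds.get(task, ())):
                break
        else:
            raise ValueError("cycle or unknown predecessor in task graph")
        pending.remove(task)
        # A task cannot start before all of its predecessors have finished.
        ready = max((finish[p] for p in preds.get(task, ())), default=0.0)
        # Assign it to the worker that becomes idle first.
        w = min(range(num_workers), key=worker_free.__getitem__)
        start = max(ready, worker_free[w])
        finish[task] = start + durations[task]
        worker_free[w] = finish[task]
    return max(finish.values(), default=0.0), finish


# Hypothetical four-task DAG: D depends on A; B and C are independent.
DURATIONS = {"A": 2.0, "B": 2.0, "C": 2.0, "D": 4.0}
PREDS = {"D": ["A"]}

good, _ = schedule_makespan(DURATIONS, PREDS, ["A", "B", "C", "D"], 2)
bad, _ = schedule_makespan(DURATIONS, PREDS, ["B", "C", "A", "D"], 2)
print(good, bad)  # 6.0 8.0

# Exhaustive search over submission sequences; feasible only for tiny DAGs.
best = min(
    schedule_makespan(DURATIONS, PREDS, list(seq), 2)[0]
    for seq in permutations(DURATIONS)
)
print(best)  # 6.0
```

The two sequences give different makespans on the same DAG and the same two workers, which is why searching over the submission sequence can reduce total runtime. Exhaustive search grows factorially with the number of tasks, so a heuristic search is needed for realistic task graphs.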
Keywords/Search Tags: Hyperspectral Images, Distributed Parallel, Fusion Classification, Spark, DAG, Task Scheduling