Research On Distributed Parallel Operation Method Of Terrestrial Carbon Cycle Model

Posted on: 2020-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: B W Zhang
Full Text: PDF
GTID: 2370330578475048
Subject: Cartography and Geographic Information System
Abstract/Summary:
With global warming and the greenhouse effect intensifying in recent years, exploring the mechanisms of climate change and predicting its trends have become a focus of academic research. Terrestrial carbon cycle models simulate the carbon cycle processes of terrestrial ecosystems and play an irreplaceable role in exploring the mechanisms of carbon exchange between the layers of the Earth system and in predicting future climate change. As these models have developed, however, their calculations have become increasingly complex and the resolution of their input data increasingly fine, so a traditional single-machine run can no longer supply the computing power they require. To address this, researchers currently parallelize models mainly by hard-coding parallelism inside the model or by using the MPI standard. Both approaches require a large amount of model-specific parallelization work, are difficult and complex to implement, and do not readily yield an easy-to-use, configurable solution for running models in parallel.

This thesis takes the parallel operation of terrestrial carbon cycle models as its research goal. Addressing the two key requirements of parallel operation, parallel computing and cluster deployment, it designs a distributed parallel computing method and a cluster deployment method for terrestrial carbon cycle models, based on parallel technology for distributed network environments. For distributed parallel computing, a distributed data storage scheme and a distributed parallel computing scheme are designed to meet the model's data storage and parallel computing requirements. For distributed deployment, to address the difficulty of deploying the model and the differences in communication modes between clusters, a virtualized deployment scheme for the model's parallel cluster and a communication scheme between parallel clusters are designed. The research contents and main results of this thesis are as follows:

(1) Distributed parallel computing method for the terrestrial carbon cycle model. To address the two key difficulties of distributed computing for the model, data storage and parallel computing, a distributed data storage scheme and a distributed parallel computing scheme are designed. For data storage, a standardization scheme for multi-source data is designed based on the characteristics of the model data; a distributed storage strategy for model data is designed on the HDFS distributed storage framework, and a corresponding optimization scheme is proposed to improve the access efficiency of model data. For computation, a parallel-oriented method for segmenting the model's computing tasks is proposed based on the model's computational characteristics; a distributed parallel computing strategy for the model is designed on the Spark distributed parallel framework, and the model's parallel performance and resource scheduling strategy are optimized.

(2) Distributed deployment method for the terrestrial carbon cycle model. Cluster deployment of the model is hampered by the complexity of the model environment and the cluster parallel environment, by the difficulty of deployment, and by the difficulty of parallel communication across distributed clusters. To address these problems, a virtualized deployment scheme for the model cluster and a communication scheme between model clusters are designed. For virtualization, an automated scheme for building the model running environment and the cluster parallel environment is designed based on Docker virtualization technology. For communication, to solve the problems of communication between virtual model containers and between distributed clusters, an inter-container communication scheme based on a virtual network and a distributed parallel inter-cluster communication scheme based on a scheduler and executors are designed, achieving rapid deployment and stable communication of the model in a distributed environment.

This thesis explores distributed computing and distributed deployment methods for the model from the two perspectives of computation and deployment in parallel operation. For computation, the model's distributed data storage scheme and distributed parallel computing scheme are designed. For deployment, the model's virtualized deployment scheme and communication scheme are designed. Together these address the key difficulties of distributed parallel operation of the model and provide a complete solution for its parallel computing and cluster deployment in a distributed environment.
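
The abstract does not spell out the task segmentation method, so the following is only a minimal sketch of the general idea it names: cutting a gridded model domain into independent tiles that can be computed in parallel and then merged. Everything in it is an assumption made for illustration: the row-band tiling, the function names (`split_into_tiles`, `toy_cell_model`, `run_tile`, `run_parallel`), and the use of a local Python thread pool as a stand-in for the Spark executors used in the thesis.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tiles(n_rows, n_cols, tile_rows):
    """Split an n_rows x n_cols grid into row bands of at most tile_rows rows.

    Each tile is (start_row, stop_row, n_cols), with stop_row exclusive.
    """
    tiles = []
    for start in range(0, n_rows, tile_rows):
        stop = min(start + tile_rows, n_rows)
        tiles.append((start, stop, n_cols))
    return tiles


def toy_cell_model(row, col):
    """Hypothetical per-cell computation standing in for a real carbon cycle model."""
    return row * 10 + col


def run_tile(tile):
    """Run the per-cell model over one tile; return its start row and its rows."""
    start, stop, n_cols = tile
    rows = [[toy_cell_model(r, c) for c in range(n_cols)] for r in range(start, stop)]
    return start, rows


def run_parallel(n_rows, n_cols, tile_rows, workers=4):
    """Compute all tiles concurrently and reassemble the full grid."""
    tiles = split_into_tiles(n_rows, n_cols, tile_rows)
    # A local thread pool is used here only as a stand-in for a cluster scheduler.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run_tile, tiles))
    grid = []
    # Merge tiles back in row order so the output matches a serial run.
    for _, rows in sorted(results, key=lambda item: item[0]):
        grid.extend(rows)
    return grid


if __name__ == "__main__":
    print(split_into_tiles(5, 3, 2))
    print(run_parallel(5, 3, 2))
```

The sketch relies on each cell being computable independently of its neighbours, which is what makes the tiles safe to run in any order. A model with lateral exchange between cells would need a different segmentation.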
Keywords/Search Tags:terrestrial carbon cycle model, parallel computing, distributed network, Spark, HDFS
Related items