Research On Hadoop Data Mining Technology For Early Warning Of Citrus Diseases And Pests

Posted on: 2019-11-10 | Degree: Master | Type: Thesis
Country: China | Candidate: T Du | Full Text: PDF
GTID: 2393330566959511 | Subject: Software engineering
Abstract/Summary:
The citrus industry plays an important role in agriculture in Jiangxi Province and is one of the key parts of the Jiangxi fruit industry project. Because of their growth characteristics, citrus crops are vulnerable to pests and diseases. An outbreak that is not completely eliminated within a short period multiplies rapidly and can cause large-scale damage to citrus plantings, reducing both the yield and the quality of the fruit. The occurrence of citrus diseases and insect pests is affected by temperature, humidity, soil, and other factors. Therefore, by analyzing the data collected during citrus cultivation, this thesis studies and improves data mining methods for the diagnosis and early warning of citrus diseases and insect pests, and then builds an early warning system for them. The results have some theoretical significance and practical value for the diagnosis and early warning of crop pests and diseases in China.

As historical data on citrus cultivation accumulates, a single algorithm model can no longer meet the practical need to analyze and mine large volumes of data. Ensemble learning and parallel computation of algorithms have become main research directions in data mining. This thesis uses decision tree mining algorithms, and its emphasis is on ensemble learning over decision trees and on the parallel computation of decision tree algorithms in Hadoop.

First, the thesis introduces the Hadoop-based distributed processing system for big data, including the HDFS system architecture, the operating mechanism, and the fault-tolerance mechanism. It analyzes how algorithms are parallelized under the MapReduce programming model, introduces the general data processing flow and the related algorithms of data mining, and describes in detail the collection and preprocessing of data on citrus pests and related factors. Next, it studies decision tree mining algorithms, covering three typical ones (ID3, C4.5, and CART), and, through ensemble learning over decision trees, introduces Random Forest and the Gradient Boosting Decision Tree (GBDT). On this basis, it studies the parallelization of the Random Forest algorithm and of the GBDT algorithm on the Hadoop platform, and analyzes and compares the performance of the two parallel algorithms through experiments. Finally, the characteristics and patterns of citrus pests and diseases are analyzed in detail, the parallel decision tree algorithms are integrated into the early warning analysis of citrus pests and diseases, and an overall framework for the early warning system is proposed. The thesis then presents the requirements analysis and detailed design of the system, describes its main functional modules and the data processing flow of the data mining component, and implements the main functions of the system through environment deployment and code development.
Keywords/Search Tags:diseases and pests warning, decision tree, parallel algorithm, Hadoop, data mining
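
The abstract does not give the thesis's implementation, so the following is only a minimal sketch of the general idea behind a MapReduce-style parallel Random Forest: each "map" task trains trees on bootstrap samples of its own data partition, and a "reduce" step combines the trees by majority vote. It is plain Python that simulates the two phases in one process and does not use the Hadoop API. The toy (temperature, humidity) records, the "ok"/"risk" labels, the function names, and the use of one-split CART-style stumps in place of full trees are all assumptions made for illustration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (the CART split criterion)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def train_stump(rows, features):
    """Fit a one-split tree: choose the (feature, threshold) pair with the
    lowest weighted Gini impurity, using midpoints between observed values."""
    best = None
    for f in features:
        values = sorted({x[f] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for x, y in rows if x[f] <= threshold]
            right = [y for x, y in rows if x[f] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, threshold, majority(left), majority(right))
    if best is None:  # no usable split: fall back to a constant prediction
        label = majority([y for _, y in rows])
        return (features[0], float("inf"), label, label)
    return best[1:]

def predict_stump(stump, x):
    feature, threshold, left_label, right_label = stump
    return left_label if x[feature] <= threshold else right_label

def map_train(partition, n_trees, seed, max_features=1):
    """Map phase: train n_trees stumps on one partition. Each stump sees a
    bootstrap sample of the rows and a random subset of the features."""
    rng = random.Random(seed)
    n_features = len(partition[0][0])
    stumps = []
    for _ in range(n_trees):
        sample = [rng.choice(partition) for _ in partition]
        features = rng.sample(range(n_features), max_features)
        stumps.append(train_stump(sample, features))
    return stumps

def reduce_forest(map_outputs):
    """Reduce phase: merge the per-partition tree lists into one forest."""
    return [stump for stumps in map_outputs for stump in stumps]

def predict(forest, x):
    """Classify x by majority vote over all stumps in the forest."""
    return majority([predict_stump(s, x) for s in forest])

# Hypothetical toy records: ((temperature in C, relative humidity in %), label).
ok_rows = [((t, h), "ok") for t in (14, 16, 18) for h in (50, 55, 60)]
risk_rows = [((t, h), "risk") for t in (26, 28, 30) for h in (85, 90, 95)]
rows = ok_rows + risk_rows

# Three partitions stand in for the input splits that separate mappers would read.
partitions = [rows[i::3] for i in range(3)]
forest = reduce_forest(
    [map_train(part, n_trees=5, seed=i) for i, part in enumerate(partitions)]
)

print(len(forest), "trees;", predict(forest, (29, 92)), predict(forest, (15, 52)))
```

Training each tree needs only its own bootstrap sample, which is why Random Forest splits naturally across independent map tasks. Gradient boosting is harder to parallelize in this way because each new tree depends on the errors of the trees before it.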