Research On Filling Methods For Missing Data Of Bamboo Germplasm Resources

Posted on: 2021-05-13
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhang
GTID: 2393330602996827
Subject: Agriculture
Abstract/Summary:
Bamboo plants have gradually become a substitute for wood materials because of their fast growth, high yield and high strength, and they occupy a very important position among engineering raw materials. Bamboo germplasm resources are an important basis for discovering and cultivating excellent bamboo plants. However, some data are inevitably missing when bamboo germplasm resources are collected and stored, which poses great challenges for data mining. To address missing data in the mining of bamboo germplasm resources, this thesis takes the incomplete data of bamboo germplasm resources as its research object. Building on AP clustering and the KNN (k-nearest neighbors) algorithm, it focuses on two key problems in filling missing bamboo germplasm data: handling missing bamboo species data dynamically, and choosing the K value for KNN filling of missing bamboo species data. It uses IAP clustering to update the clustering results dynamically, improves KNN filling, and develops a bamboo germplasm resource data analysis system based on the Spring Boot framework. The main tasks are as follows.

(1) Data cleaning technology and its process were analyzed, and the problem of missing data and the methods for cleaning it were discussed. First, the causes of data loss, the patterns of data loss and the mechanisms of data loss are described in detail. Then the common cleaning methods for missing data are compared and analyzed. Finally, the advantages and disadvantages of AP clustering and the KNN algorithm are compared. Given the stability of AP clustering and the good performance of the KNN algorithm, KNN filling is concluded to be the most suitable cleaning method for missing data of bamboo germplasm resources.

(2) A method for filling missing data of bamboo germplasm resources was studied. On the basis of AP clustering and the KNN filling algorithm, a filling method for missing bamboo species data based on IAP-SKNN is proposed. First, IAP clustering is used to update the clustering results dynamically so that the information in the complete data is fully used. Then KNN filling is improved so that the filled data converge quickly without a K value having to be set. Finally, in experiments against five other KNN filling algorithms, the average filling error was reduced by amounts ranging from 16.5% to 53%. In addition, the completed (filled) bamboo species data at a missing rate of 15% was used for classification, and the classification accuracy reached 93.25%, which verified the good filling performance of IAP-SKNN.

(3) A bamboo germplasm resource data analysis system was developed. Based on the IAP-SKNN algorithm, the system was designed and built with the Spring Boot 2.0 framework, combined with a MySQL database, the open-source ECharts visualization plug-in, the open-source Elasticsearch full-text search engine and other components. It comprises a bamboo species data entry management module, a bamboo species data cleaning module, a bamboo species data analysis module and a system user management module. The system connects to the bamboo species database in real time to process and analyze missing bamboo species data, and an example verifies the effectiveness of the method and technology. The system avoids the difficulties that missing data cause for the analysis and mining of bamboo species data, and helps the information system analyze and manage bamboo species data efficiently.
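To make the KNN filling idea discussed above concrete, the following is a minimal sketch of plain KNN filling with a fixed K. It is not the IAP-SKNN method of the thesis: it performs no clustering, and K must be set by hand, which is exactly the limitation the thesis sets out to remove. The function name `knn_fill`, the choice of Euclidean distance over observed columns, and the sample trait values are all assumptions made for illustration.

```python
import math


def knn_fill(rows, k=3):
    """Fill None entries with the mean of the k nearest complete rows.

    Plain KNN filling with a hand-picked k (illustrative only; the
    thesis's IAP-SKNN avoids setting k and is not reproduced here).
    """
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("need at least one complete row")

    filled = []
    for row in rows:
        missing = [j for j, v in enumerate(row) if v is None]
        if not missing:
            filled.append(list(row))
            continue

        observed = [j for j, v in enumerate(row) if v is not None]

        def dist(candidate):
            # Euclidean distance over the columns this row actually has.
            return math.sqrt(
                sum((row[j] - candidate[j]) ** 2 for j in observed)
            )

        neighbours = sorted(complete, key=dist)[:k]
        new_row = list(row)
        for j in missing:
            # Replace each missing value with the neighbours' mean.
            new_row[j] = sum(n[j] for n in neighbours) / len(neighbours)
        filled.append(new_row)
    return filled


# Hypothetical records: [height, diameter, internode length].
sample = [
    [10.0, 5.0, 30.0],
    [11.0, 5.5, 32.0],
    [20.0, 12.0, 45.0],
    [10.5, None, 31.0],  # diameter is missing
]
print(knn_fill(sample, k=2)[3])
```

In this sketch the two nearest complete records decide the filled value, so the result depends directly on the chosen K. In practice the columns would usually be scaled before distances are computed, since traits measured in different units otherwise dominate the distance unevenly.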
Keywords/Search Tags: bamboo plant, germplasm resources, data cleaning, missing data fill