Research On Multimedia Completion Method Considering Data Distribution Characteristics

Posted on: 2020-04-14
Degree: Master
Type: Thesis
Country: China
Candidate: D Tian
Full Text: PDF
GTID: 2370330572484361
Subject: Management Science and Engineering
Abstract/Summary:
In the era of big data, missing data is common and often unavoidable, and incomplete data distorts statistical analysis. If missing values are completed poorly, the information in the data cannot be used fully or effectively. The handling of missing data is therefore a key factor in data quality, and the completion of incomplete data is a worthwhile research topic. First, this thesis reviews domestic and international research on incomplete data and describes the theory behind three families of completion methods: statistical, clustering-based and intelligent. Second, experiments on constructed data sets show that the distribution characteristics of the data strongly influence the quality of data completion. Third, a BP neural network completion method is introduced. It uses the DBSCAN density clustering method to group the sample data, analyze their distribution characteristics, eliminate noise data and select training samples, and then uses a BP neural network to fit the non-linear relationship between data attributes and predict the missing values. Finally, the wheat seed and iris flower data sets are processed separately: a subset of observations is selected as experimental data, and one or more attributes of these complete records are treated as missing. Four methods are used to complete the data: the least squares method, the K-nearest neighbor method, the BP network method that considers data distribution, and the BP network method that does not. Completion experiments predict the missing items, compute the accuracy and compare the completion quality of the four methods. Analysis of the case data sets shows that the BP neural network that considers data distribution characteristics achieves the best completion accuracy of the four methods.
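The idea of filtering noise with density clustering before completing a missing attribute can be illustrated with a small sketch. This is not the thesis's implementation: the toy data, the `eps` and `min_pts` values, and the function names `dbscan_labels` and `knn_impute` are all assumptions made for illustration, and the BP neural network is replaced here by a K-nearest neighbor mean (one of the baselines named in the abstract) to keep the example short and free of third-party dependencies.

```python
import math

def dbscan_labels(points, eps, min_pts):
    """Minimal DBSCAN: returns one cluster id per point, -1 for noise."""
    n = len(points)
    # Each neighbourhood includes the point itself, as in the usual definition.
    neighbours = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                  for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:  # only core points expand the cluster
                queue.extend(neighbours[j])
    return labels

def knn_impute(train, query, missing_idx, k):
    """Predict query[missing_idx] as the mean of that attribute over the k
    training rows nearest to the query on the observed attributes."""
    def observed(row):
        return [v for i, v in enumerate(row) if i != missing_idx]
    ranked = sorted(train, key=lambda r: math.dist(observed(r), observed(query)))
    nearest = ranked[:k]
    return sum(r[missing_idx] for r in nearest) / len(nearest)

# Hypothetical toy data: two dense groups plus one outlier (the last row).
rows = [(1.0, 2.0), (1.1, 2.2), (0.9, 1.8), (1.2, 2.4),
        (5.0, 10.0), (5.1, 10.2), (4.9, 9.8), (5.2, 10.4),
        (3.0, 50.0)]

labels = dbscan_labels(rows, eps=0.5, min_pts=3)
clean = [r for r, lab in zip(rows, labels) if lab != -1]  # drop noise rows

query = (3.5, None)  # second attribute is the missing item
raw_pred = knn_impute(rows, query, missing_idx=1, k=2)        # outlier included
filtered_pred = knn_impute(clean, query, missing_idx=1, k=2)  # outlier removed
print(labels, raw_pred, filtered_pred)
```

On this toy data the outlier sits closest to the query on the observed attribute, so the unfiltered prediction is pulled far from the values of the dense groups, while the prediction made on the noise-filtered training set is not. The same filtering step could precede any regressor, including a neural network.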
Keywords/Search Tags: Data completion, Density clustering, Sample distribution, BP neural network, Machine learning