
Research And Implementation Of ResDAE-KMeans++ Algorithm In Data Visualization Analysis Platform

Posted on: 2022-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhang
GTID: 2518306323492014
Subject: Master of Engineering
Abstract/Summary:
The idea of using data to guide production and daily life has long existed. As society develops, initiatives such as Internet+, innovation-driven development, the national strategic cloud, smart cities, single-domain research, and multi-domain integration all reflect its ongoing digitization. As more data becomes available and more problems need to be solved, it is increasingly difficult for non-professionals to analyze data efficiently and in detail. In response, the data visualization analysis team of the Shenzhen Big Data Research Institute designed an online data visualization analysis platform that combines data analysis and visualization technology. The platform aims to improve the efficiency of data processing and to lower the barrier to entry for data analysis and data visualization. It integrates more than 100 data analysis and data visualization functions. The author participated in the design and implementation of the platform, integrating a number of data analysis and visualization algorithms into it and encapsulating the underlying calls so that these functions can be offered to users as a service. With simple operations, users get a one-stop service covering data acquisition, processing, analysis, and visualization.

During the development and use of the platform, unsupervised clustering, an important class of machine learning algorithms, came to play a key role. However, the traditional clustering algorithms used in the platform are inefficient and perform poorly on high-dimensional data. This thesis therefore proposes a new unsupervised clustering algorithm that incorporates deep learning, named ResDAE-KMeans++. To address the curse of dimensionality in high-dimensional data, the method builds on a Deep Auto Encoder (DAE) with residual units and uses K-means++ to cluster the data in the resulting low-dimensional feature space. Compared with traditional unsupervised clustering algorithms, clustering in the feature space extracted by the nonlinear residual autoencoder is significantly faster, and accuracy also improves. The method is compared with traditional unsupervised algorithms on the Iris, Wine, and MNIST data sets, and the experimental results show that ResDAE-KMeans++ has clear advantages over them.
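The clustering stage described above can be illustrated with a minimal sketch of K-means++ applied to low-dimensional features. This is not the thesis implementation: the residual autoencoder is omitted, the `latent` points are hypothetical values standing in for encoder output, and the function names (`sq_dist`, `kmeans_pp_init`, `kmeans`) are chosen here for illustration only. The sketch assumes the data contains at least `k` distinct points.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """K-means++ seeding: the first center is chosen uniformly at random;
    each later center is sampled with probability proportional to its
    squared distance from the nearest center chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def kmeans(points, k, iters=50, seed=0):
    """Lloyd's iterations started from K-means++ seeds."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    labels = []
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return centers, labels


# Hypothetical 2-D latent codes (assumed encoder output), in three groups.
latent = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (10.0, 0.0), (10.1, 0.0), (10.0, 0.1),
]
centers, labels = kmeans(latent, k=3)
print(labels)
```

In the pipeline the abstract describes, `latent` would be replaced by the features produced by the trained residual autoencoder, so the distance computations run in the low-dimensional space rather than on the raw high-dimensional input.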
Keywords/Search Tags:Data Visualization, Data analysis, Data visualization analysis platform, Clustering algorithm, ResDAE-KMeans++