Study On K-means Clustering Algorithm Based On Summarized Information Of RTVU Students

Posted on: 2013-05-07
Degree: Master
Type: Thesis
Country: China
Candidate: L Zheng
Full Text: PDF
GTID: 2268330422952328
Subject: Computer application technology
Abstract/Summary:
With the rapid development of software and hardware technology, and with computers being applied at higher levels and across a wider scope, society's ability to produce, collect, and store data has grown quickly. The result is that we are drowning in an ocean of data and cannot extract the useful information hidden in it. Data mining technology arose in response to this problem. Clustering analysis is an important part of data mining and has been widely used in image recognition, data analysis, pattern recognition, and many other fields. By identifying dense and sparse regions, clustering analysis lets the computer discover the global distribution pattern of the data and interesting relationships among data attributes. Among the many clustering algorithms, K-means has been widely used because it is simple, fast, and well suited to small and medium-sized data sets with roughly spherical clusters.

Motivated by the needs of actual teaching practice, this thesis uses data mining technology to process students' comprehensive information and applies the K-means algorithm to cluster the students.

Firstly, the thesis sets out the significance of applying data mining and clustering to the summarized information of RTVU students, with the final aim of offering individualized tutoring.

Secondly, it applies attribute generalization to perform attribute-oriented induction on the current summarized information of RTVU students, and preprocesses the students' data.

Thirdly, after analyzing how strongly the initial cluster centers affect the final result of K-means clustering, the thesis proposes an improved algorithm that chooses the initial centers by the farthest-point method. To address the shortcoming that K-means requires the number of clusters K to be preset from experience or prior knowledge, the thesis proposes a second algorithm that finds the best number of clusters according to a validity function. Both improved algorithms are then verified on test data.

Finally, the thesis analyzes the real summarized information of the students with the clustering algorithm and gives guiding suggestions for future teaching based on the clustering result.
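The two improvements described above can be illustrated with a minimal Python sketch. The abstract does not give the exact selection rule or the validity function, so everything here is an assumption for illustration: the first center is taken as the point farthest from the overall mean, each further center is the point farthest from its nearest already-chosen center, and the validity score is the mean squared within-cluster distance divided by the smallest squared distance between centers (smaller is better). The names `farthest_point_centers`, `validity`, and `best_k` are hypothetical and do not come from the thesis.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def farthest_point_centers(points, k):
    """Assumed farthest-point rule: start from the point farthest from the
    overall mean, then repeatedly add the point whose distance to its
    nearest chosen center is largest."""
    centers = [max(points, key=lambda p: dist(p, mean(points)))]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iter=100):
    """Standard K-means iterations, seeded with farthest-point centers."""
    centers = farthest_point_centers(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = mean(members)
    return centers, labels


def validity(points, centers, labels):
    """Assumed validity score: within-cluster scatter over the minimum
    squared separation between centers. Smaller means better."""
    intra = sum(dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels)) / len(points)
    inter = min(dist(a, b) ** 2 for i, a in enumerate(centers) for b in centers[i + 1:])
    return math.inf if inter == 0 else intra / inter


def best_k(points, k_max):
    """Try k = 2..k_max and return the k with the smallest validity score."""
    scores = {}
    for k in range(2, k_max + 1):
        centers, labels = kmeans(points, k)
        scores[k] = validity(points, centers, labels)
    return min(scores, key=scores.get), scores
```

Calling `best_k(points, k_max)` on a list of numeric tuples returns the selected number of clusters together with the score for each candidate k. Seeding from the farthest points makes the result deterministic for a given data set, which is one plausible reason to prefer it over random initialization.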
Keywords/Search Tags: Data mining, Algorithm, Clustering, Pre-processing, K-means