Subdivision Of IM-software's Users In Telecom Field Using Cluster Algorithm

Posted on:2017-02-28

Degree:Master

Type:Thesis

Country:China

Candidate:M Sun

Full Text:PDF

GTID:2308330485478994

Subject:Operational Research and Cybernetics

Abstract/Summary:

In the age of big data, the value of data is becoming increasingly apparent. Data mining is widely used as an important means of extracting that value, and uncovering latent value in data is becoming a significant source of productivity across industries. To discover regular patterns and high-value customers in the data and to offer them differentiated services, this thesis segments the users of IM software with clustering algorithms, based on how often they access these applications and how much data traffic they consume.

Data preprocessing is an important phase of data mining: it links the understanding of the business and the data to model building and algorithm design. It directly influences the clustering result, and good results cannot be obtained unless the original data are adequately understood, analyzed, and processed beforehand. To match the business requirements to the algorithms, the thesis preprocesses the original data on the basis of this understanding of the business and the data, producing the final data set for clustering. The preprocessing is described in detail.

Two clustering algorithms are used: K-means and a biclustering algorithm based on Large Average Submatrices (LAS). First, the traditional K-means algorithm is chosen to segment the data according to its features, and the clustering results are presented and interpreted. Second, when segmenting the data with biclustering, the thesis modifies both the algorithm and the score function S(·) of the LAS model, proposed by Shabalin in 2009, to suit the features of the data.

After modification, the algorithm and score function fit the data well, and the resulting biclusters can be interpreted in terms of the business requirements. The improvement to the algorithm greatly reduces its complexity. The improvement to the score function not only makes it fit the data and further reduces the complexity of the algorithm; more importantly, it allows the parameter to be chosen according to the features of the data set at hand, which makes the overall algorithm more adaptive.
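
As a rough illustration of the K-means step described in the abstract, the following Python sketch clusters a handful of invented users by visit count and data traffic. The sample values, the function names, the z-score standardization, and the choice of k = 2 are all assumptions made for illustration; none of them is taken from the thesis, and the thesis's own preprocessing is not reproduced here.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Scale each column to zero mean and unit variance (z-scores)."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stds))
        for row in rows
    ]


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct rows as initial centroids
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(mean(dim) for dim in zip(*members))
    return labels, centroids


# Hypothetical users: (number of visits, traffic in MB). Invented values.
users = [
    (3, 1.2), (5, 2.0), (4, 1.5),
    (120, 300.0), (150, 410.0), (135, 350.0),
]

labels, centroids = kmeans(standardize(users), k=2)
for user, label in zip(users, labels):
    print(user, "-> cluster", label)
```

Standardizing first keeps the traffic column, which has the larger numeric range, from dominating the distance. On real data the number of clusters and the scaling method would need to be chosen from the data and the business requirements.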

Keywords/Search Tags:

K-means, Bicluster, LAS-Model, Data Preprocessing

Related items

1	Data Preprocessing And K-Means Clustering Based Support Vector Regression Model
2	An Improved Bicluster Algorithm And Its Application
3	Bicluster Analysis Of Heterogeneous Panel Data Via M-Estimation
4	Study On Data Preprocessing Techniques In RFID Complex Applications
5	The Design And Implementation Of Bicluster Data Analyzing Software
6	Research And Application On Data Preprocessing System Of Mobile Internet Data
7	The Research And Application In Text Clustering Of K-Means Algorithm
8	The Study On Data Mining Algorithm And Application In Web Log Analysis
9	Design And Implementation Of Data Preprocessing System Oriented To Data Mining