Data is the carrier of information, and information is the content of data; computer systems are generally built to store and process data. Processing data and extracting information from it are the basic functions of an information system. In today's highly information-oriented society, the network can be regarded as the largest information system, and its data are huge in volume, diverse, heterogeneous, and dynamic. How to quickly extract useful information from massive amounts of data has become a serious problem for developers during application development. The emergence of cloud computing has brought new opportunities to data mining technology. Cloud computing distributes storage and computing power across the many nodes of a cluster. By deploying large numbers of cheap commodity PCs, it makes massive data storage and analysis possible; clusters vary in size, but ordinary PCs are far cheaper than high-performance computers, so overall cost is reduced. Using clusters of commodity servers lowers both storage and computing costs for enterprises, gradually making cloud-based mining of big data feasible. Hadoop, as open-source cloud computing software, is efficient, scalable, and low-cost, and has been widely applied in the data mining field. This paper studies the integration of Hadoop with a data mining system, selects the classic and widely used Apriori algorithm as an algorithm module of the new system, and improves it so that it can handle massive data more efficiently. The research methods used in this paper include the literature method, the structured approach, and the case study method, which guide the analysis of a Hadoop-based cloud data mining system architecture.
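As background for the improvement discussed below, the classic Apriori algorithm can be sketched as follows. This is a minimal illustrative implementation, not the paper's own code; the function name `apriori` and the simple combination-based candidate generation are assumptions made for clarity.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Classic Apriori: grow frequent itemsets level by level,
    pruning any candidate with an infrequent subset."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items and keep the frequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            s = frozenset([item])
            counts[s] = counts.get(s, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Candidate generation: k-combinations of items seen in
        # frequent (k-1)-itemsets, pruned by the Apriori property
        # (every (k-1)-subset of a candidate must be frequent).
        items = sorted({i for s in frequent for i in s})
        candidates = [frozenset(c) for c in combinations(items, k)
                      if all(frozenset(sub) in frequent
                             for sub in combinations(c, k - 1))]
        # Support counting: one full scan of the database per level.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result
```

The repeated full database scan in the support-counting step is precisely the bottleneck on massive data that motivates the MapReduce-based improvement described in this paper.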
This paper describes the traditional Apriori algorithm and, through an example of the implementation process, shows that the improved algorithm is feasible. Combining a typical data mining system architecture with Hadoop, it proposes a Hadoop-based data mining system architecture and briefly expounds each functional module. To overcome Apriori's bottleneck in processing massive data, the algorithm is improved using the MapReduce programming model, based on the idea of partitioning the database and mining the partitions in parallel.
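The partition-and-parallelize idea can be sketched as a two-pass scheme in the style of MapReduce. The following is a simplified single-machine sketch, not the paper's Hadoop implementation: the map phase finds itemsets that are frequent within one database partition, and the reduce phase counts the union of all local candidates against the full database. The function names, the `max_k` limit, and the choice of local threshold are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

def map_local_frequent(partition, local_min_support, max_k=3):
    """Map phase: itemsets frequent within one partition.
    Any globally frequent itemset must be locally frequent in at
    least one partition, so the union of map outputs over all
    partitions is a complete candidate set."""
    txns = [frozenset(t) for t in partition]
    out = set()
    for k in range(1, max_k + 1):
        counts = Counter()
        for t in txns:
            for c in combinations(sorted(t), k):
                counts[frozenset(c)] += 1
        out |= {s for s, n in counts.items() if n >= local_min_support}
    return out

def reduce_global_count(candidates, partitions, min_support):
    """Reduce phase: count each candidate over the whole database
    and keep those meeting the global support threshold."""
    counts = Counter()
    for part in partitions:
        for t in part:
            ts = frozenset(t)
            for c in candidates:
                if c <= ts:
                    counts[c] += 1
    return {c: n for c, n in counts.items() if n >= min_support}
```

Because each map task scans only its own partition, the expensive candidate-finding work is spread across the cluster, and the full database is read only twice regardless of itemset length, which is the source of the efficiency gain on massive data.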