The Research Of Quality Analysis And Evaluation Of Tracks Based On Association Rule Algorithm

Posted on:2017-11-01

Degree:Master

Type:Thesis

Country:China

Candidate:Q C Bai

Full Text:PDF

GTID:2322330491458128

Subject:Computer technology

Abstract/Summary:

As informatization advances across all industries, data mining technology has come into wide use. With the explosive growth of information, however, traditional data mining techniques can hardly meet business needs. Traditional serial algorithms mine slowly and cannot respond quickly to massive data sets. As data volumes grow, data structures become more complex and the dimensionality of the data keeps rising. Faced with this rapid growth, traditional frequent itemset mining algorithms must scan the database repeatedly, which increases running time and seriously degrades efficiency. Parallel data mining emerged against this background.
Association rule mining is an important branch of parallel data mining that extracts associated itemsets from data sets, so association rule algorithms have broad application prospects in many industries. In recent years, cloud computing platforms such as Hadoop have attracted growing attention from researchers, and parallel implementation of traditional frequent itemset mining algorithms has become an important research direction. Frequent itemset mining has two bottlenecks: too many iterations and excessive I/O load. The Hadoop platform inherits many advantages of cloud computing and provides an effective strategy for distributed storage and parallel computation of big data. Owing to its high availability and low cost, Hadoop can be used to relieve this pressure.
Although parallel association rule algorithms have been widely studied in recent years, they still have a drawback: because of the excessive number of iterations, each scan pass generates a large number of candidate itemsets.
By studying the working principle, operation mechanism, and fault-tolerance mechanism of the MapReduce computing model, this paper proposed an optimization method for parallel frequent itemset mining based on MapReduce. It also carried out a theoretical design of MapReduce-based frequent itemset mining algorithms and applied the improved algorithm to rail quality analysis and evaluation. Analyzing the rail defect data generated strong association rules. In the MapReduce parallel computation, the data partition matrix Tk was split and stored by rows. The computational load was spread across all nodes in the cluster, which reduced the time spent on vector multiplication and on moving the matrix in each iteration. Finally, this paper analyzed and discussed the algorithm in detail.
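The general idea of counting itemsets over row-partitioned data and then deriving strong rules can be sketched in a few lines of Python. This is only an illustrative sketch, not the algorithm of the thesis: the abstract does not give the details of the optimized method or of the matrix Tk, so the example simulates the map and reduce steps in a single process with plain counting instead of matrix-vector multiplication. The function names, the sample defect labels, and the thresholds are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def split_rows(transactions, n_parts):
    """Row-wise partitioning: block i receives rows i, i + n_parts, ..."""
    return [transactions[i::n_parts] for i in range(n_parts)]


def map_count(block, k):
    """Map step: count every k-item subset within one block of rows."""
    counts = Counter()
    for row in block:
        for itemset in combinations(sorted(set(row)), k):
            counts[itemset] += 1
    return counts


def reduce_counts(partials, min_count):
    """Reduce step: sum per-block counts, keep itemsets meeting min_count."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return {s: c for s, c in total.items() if c >= min_count}


def strong_rules(freq1, freq2, min_conf):
    """Rules a -> b from frequent pairs; confidence = count(a, b) / count(a)."""
    rules = []
    for (a, b), pair_count in freq2.items():
        for lhs, rhs in ((a, b), (b, a)):
            conf = pair_count / freq1[(lhs,)]
            if conf >= min_conf:
                rules.append((lhs, rhs, conf))
    return sorted(rules)


def mine(transactions, n_parts, min_count, min_conf):
    """Run both counting passes over the row blocks, then derive rules."""
    blocks = split_rows(transactions, n_parts)
    freq1 = reduce_counts([map_count(b, 1) for b in blocks], min_count)
    freq2 = reduce_counts([map_count(b, 2) for b in blocks], min_count)
    return strong_rules(freq1, freq2, min_conf)


# Hypothetical rail defect records, one list of defect labels per rail.
records = [
    ["crack", "wear", "corrugation"],
    ["crack", "wear"],
    ["wear", "corrugation"],
    ["crack", "wear"],
    ["crack"],
    ["wear", "corrugation"],
]

for lhs, rhs, conf in mine(records, n_parts=2, min_count=3, min_conf=0.7):
    print(f"{lhs} -> {rhs} (confidence {conf:.2f})")
```

Because each block is counted independently and the partial counts are only summed at the end, the result does not depend on how many row blocks are used. That independence is the property that lets the counting work be spread across cluster nodes.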

Keywords/Search Tags:

MapReduce, Association rule algorithm, Data mining, Frequent Itemset mining

Related items

1	Data Mining Technology’s Application In Railway Container Freight And Maintenance
2	The Crane Of The Statistical Characteristics Of The Dynamic Mechanical Properties Of Parameter Analysis And Association Rule Mining
3	Alarm Association Rules Mining In Civil Aviation Passenger Service Information System
4	Association Rules Mining And Analysis For High Speed Railway Overhead Contact System Fault Data
5	Research On The Fault Association Analysis For High-speed EMU
6	Analysis Of Road Traffic Accidents Based On Data Mining Approach Of Association Rules
7	Research On Civil Aviation Hazard Management System And Its Key Technology
8	Research On Traffic Congestion Prediction Based On Sequential Association Rule Mining
9	Research On Approach To The Generation Of Frequent Itemset Based On Attention And Its Application In Network Management Of OAS
10	Research On Eclat Algorithm Based On Flink Platform And Its Application In EMU Fault Association Mining