| With the development of information technology, data are generated faster and in larger volumes, which makes data screening difficult, lowers mining efficiency, and results in unnecessary losses. It is therefore of great significance to find the required data within a large amount of data. Association rule mining is a technique for finding closely related itemsets in big data; it is mainly used to filter data quickly and reduce the burden of subsequent data processing. This paper studies the Eclat association rule mining algorithm and addresses several of its problems. The main research contents are as follows: (1) To address the problem that Eclat generates too many candidate sets during mining, which results in long mining times, a multi-threaded parallel mining algorithm, Con-Eclat, is proposed. Con-Eclat partitions the data into classes according to transaction-set prefix and assigns a thread to each class for mining. This improves the running efficiency of Eclat during mining and reduces the generation of duplicate transaction sets. The effectiveness of Con-Eclat is verified by comparing its mining efficiency with that of other algorithms. Experiments show that Con-Eclat significantly improves the mining efficiency of Eclat on dense data sets, but not on sparse data sets. (2) To address the problem that Con-Eclat does not significantly improve mining efficiency on sparse data sets, the storage structure of transaction sets is redesigned to improve their storage efficiency and the speed of intersection operations, and thereby to further improve the mining efficiency of Con-Eclat. A locality-sensitive hash bitmap (LBM) storage structure is proposed. LBM combines locality-sensitive hashing with a bitmap data structure and dynamically adjusts its internal storage structure as the amount of stored data changes. LBM improves the space efficiency of data storage and the speed of logical operations between data structures. An LBM-based Con-Eclat algorithm is then proposed, which stores the transaction sets of Con-Eclat in LBM in order to improve the storage efficiency and intersection speed of transaction sets during mining. Comparative experiments show that the LBM-based Con-Eclat algorithm effectively remedies the limited speedup of Con-Eclat on sparse data sets, reduces space consumption during mining, and improves the efficiency of Con-Eclat in association rule mining. |
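
The two ideas in the abstract, mining each prefix class in its own thread and storing transaction sets as bitmaps so that intersection becomes a bitwise AND, can be illustrated with a minimal Python sketch. It is not the paper's Con-Eclat or LBM implementation. It assumes that a "prefix" means the first item of an itemset, it uses plain Python integers as fixed bitmaps in place of the adaptive LBM structure, and all function names (`vertical_bitmaps`, `mine_class`, `parallel_eclat`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def vertical_bitmaps(transactions):
    """Map each item to an int whose bit i is set when transaction i contains it."""
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def support(bitmap):
    """Number of transactions in the bitmap (count of set bits)."""
    return bin(bitmap).count("1")


def _mine(prefix, members, min_support, out):
    """Depth-first Eclat step.

    members: (item, bitmap) pairs; each bitmap is the transaction set
    of the frequent itemset prefix + (item,).
    """
    for i, (item, bits) in enumerate(members):
        itemset = prefix + (item,)
        out[itemset] = support(bits)
        narrowed = []
        for other, other_bits in members[i + 1:]:
            joined = bits & other_bits  # transaction-set intersection as bitwise AND
            if support(joined) >= min_support:
                narrowed.append((other, joined))
        _mine(itemset, narrowed, min_support, out)


def mine_class(index, frequent, min_support):
    """Mine every frequent itemset whose first item is frequent[index] (assumed prefix class)."""
    out = {}
    _mine((), [frequent[index]], min_support, out)  # records the single item itself
    item, bits = frequent[index]
    narrowed = [
        (other, bits & other_bits)
        for other, other_bits in frequent[index + 1:]
        if support(bits & other_bits) >= min_support
    ]
    _mine((item,), narrowed, min_support, out)
    return out


def parallel_eclat(transactions, min_support, workers=4):
    """Mine frequent itemsets, submitting one task per prefix class to a thread pool."""
    bitmaps = vertical_bitmaps(transactions)
    frequent = sorted(
        (item, bits) for item, bits in bitmaps.items() if support(bits) >= min_support
    )
    result = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        tasks = pool.map(
            lambda i: mine_class(i, frequent, min_support), range(len(frequent))
        )
        for partial in tasks:
            result.update(partial)  # classes are disjoint, so no key collisions
    return result
```

Each task returns its own dictionary, so the threads share no mutable state. In standard CPython the global interpreter lock limits the speedup of CPU-bound threads, so the sketch shows the partitioning scheme rather than the performance gains reported in the abstract.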