| With the rapid development of the Internet and the widespread use of data mining, data processing technology has become an indispensable part of people's lives, especially in the securities market. The securities market is an important part of the national economy: the stability of the stock market bears not only on the prosperity and development of the national economy but also on the interests of the many investors in the market. Research on user data helps securities companies carry out customer segmentation and precisely targeted operations. However, the characteristics of user data in the securities market make many data mining methods perform poorly. Most user data in the stock market are multidimensional attribute data in which each attribute takes exactly one value per record, so different values of the same attribute cannot appear in the same frequent itemset. The Apriori algorithm ignores this property and performs the join operation directly on the frequent itemsets, generating a large number of invalid candidate itemsets. Data mining algorithms should be optimized and adjusted to the characteristics of the data at hand. In this paper, an improved Apriori algorithm is proposed based on an analysis of the bottlenecks of the original algorithm. The method redefines a data storage scheme similar to matrix addition and optimizes the join step that generates candidate itemsets, so as to avoid generating invalid candidate itemsets and improve the efficiency of the algorithm. The pruning step uses the property that every subset of a frequent itemset is itself frequent, and deletes all candidates that contain an infrequent subset. The improved algorithm is applied to data from an existing securities user management system, and the association rules among the data are obtained. These rules are of great help to securities analysis. An example is given to verify the feasibility of the improved Apriori algorithm. In comparing it with the original Apriori algorithm, the influence of two factors, the amount of data and min_support, on execution time is analyzed. Performance tests of the original and improved algorithms show that the improved algorithm is more efficient. However, the improved Apriori algorithm is not perfect: there is still room to reduce the number of database scans and the storage space and to avoid more invalid candidate itemsets, which will be the focus of future research. |
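
The attribute-aware join described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: it does not reproduce the matrix-like storage scheme, and the function name `frequent_itemsets`, the record layout (one dict per user, mapping attribute to value), and the sample attributes are assumptions made for illustration. It only shows the idea that two itemsets are not joined when the result would contain two values of the same attribute, followed by standard Apriori subset pruning.

```python
from itertools import combinations


def frequent_itemsets(records, min_support):
    """Apriori with an attribute-aware join (illustrative sketch).

    records: list of dicts mapping attribute -> value, one value per attribute.
    min_support: minimum fraction of records an itemset must appear in.
    Returns a dict mapping each frequent itemset (frozenset of
    (attribute, value) pairs) to its support.
    """
    transactions = [frozenset(r.items()) for r in records]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Frequent 1-itemsets.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support}

    result = {}
    k = 1
    while current:
        for s in current:
            result[s] = support(s)

        candidates = set()
        for a, b in combinations(current, 2):
            union = a | b
            if len(union) != k + 1:
                continue
            # Attribute-aware join: a record holds one value per attribute,
            # so a candidate with a repeated attribute can never be frequent.
            attrs = [attr for attr, _ in union]
            if len(set(attrs)) != len(attrs):
                continue
            # Pruning: every k-subset of a candidate must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k)):
                candidates.add(union)

        current = {c for c in candidates if support(c) >= min_support}
        k += 1
    return result


if __name__ == "__main__":
    # Hypothetical user records, for illustration only.
    sample = [
        {"age": "young", "risk": "high"},
        {"age": "young", "risk": "high"},
        {"age": "old", "risk": "low"},
        {"age": "young", "risk": "low"},
    ]
    for itemset, sup in frequent_itemsets(sample, 0.5).items():
        print(sorted(itemset), sup)
```

In this sketch the pair `("risk", "high")` and `("risk", "low")` is rejected at the join step, before any support counting, which is where the savings over the unmodified join would come from.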