| With the spread of networks and communication technology, the volume of data generated on the Internet every day is growing explosively, and an increasing share of it comes from mobile terminals. These data include voice, text, pictures, video, and other structured and unstructured data. Many mature techniques exist for analyzing structured data, whereas the processing of unstructured data is still at the research stage. In particular, how to analyze massive unstructured data and extract valuable information from it remains a goal of big data researchers. This raises two questions: first, how to convert these vast amounts of unstructured data into structured data; second, which analysis methods to use to analyze the data and mine valuable information. This paper takes as its data source the unstructured image data generated by customers' purchases on the Jingdong mobile client, and applies the Apriori association rule algorithm to analyze the correlations in these data, in order to mine the items that interest individual users. The aim is to improve the accuracy of single-product and bundled-product recommendations, enhance the influence of the client, stimulate consumers' desire to buy, and maximize profit. The main work covers the following aspects: (1) By analyzing the commodity information that users browse on the Jingdong client, a total of 1353 pictures were extracted and sorted into 13 categories. These unstructured image data were processed and written to XML files, and the resulting structured data were stored in a database. (2) A correlation analysis model was established: the Apriori algorithm was studied in detail and applied to the data to obtain association rules, which were then used for correlation analysis. 
Analyzing the data with this algorithm revealed two problems: 1) data processing is relatively slow; 2) the association rules obtained do not necessarily match users' degree of interest. To address these two problems, the data are divided into stages, which reduces the amount of data in each analysis and so improves processing speed, and association rules are analyzed in two parts, inside and outside the data; on this basis the InOut-Apriori algorithm is proposed. The improved algorithm is used to derive new association rules and perform correlation analysis, and the analysis results before and after the improvement are compared, showing that the new method can solve the above two problems. |
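
To make the association rule step concrete, the following is a minimal sketch of the standard, textbook Apriori procedure: frequent itemsets are mined level by level, then rules are derived by support and confidence. It is not the InOut-Apriori algorithm proposed in the paper, whose details the abstract does not give. The function names, the thresholds, and the sample transactions (product categories per purchase) are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item that occurs in the data.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Return (lhs, rhs, support, confidence) for rules meeting min_confidence."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is itself frequent,
                # so its support is already in the dictionary.
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, support, confidence))
    return rules


# Hypothetical purchase records: each set lists the categories in one order.
transactions = [
    {"phone", "case"},
    {"phone", "case", "charger"},
    {"phone", "charger"},
    {"laptop", "mouse"},
    {"phone", "case"},
]

frequent = apriori(transactions, min_support=0.3)
rules = association_rules(frequent, min_confidence=0.7)
for lhs, rhs, support, confidence in rules:
    print(sorted(lhs), "->", sorted(rhs), round(support, 2), round(confidence, 2))
```

Each pass rescans every transaction, which is one reason plain Apriori slows down as the data grow; splitting the data into smaller segments, as the abstract describes, reduces the size of each scan.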