Mining Association Rules In The Bigdata Of Guizhou Cigarette Brand

Posted on: 2018-03-14
Degree: Master
Type: Thesis
Country: China
Candidate: S Yu
Full Text: PDF
GTID: 2381330515997785
Subject: Cartography and Geographic Information System
Abstract/Summary:
The Guizhou tobacco project holds a large volume of cigarette sales data. How to make good use of these data and realize their potential value, so as to achieve precision marketing of cigarettes in the tobacco industry, is an urgent and challenging problem. To address it, this thesis applies data mining and Hadoop big data technology to study corresponding solutions. Based on the specific requirements of precision tobacco marketing in Guizhou, the Apriori and FP-Growth algorithms are used to mine cigarette brand association rules from the most recent five years of sales data. In addition, Hadoop big data technology is studied as support for cigarette brand association rule mining. The work and results of this thesis are as follows:

1) Cigarette brand association rules are mined with R. This covers requirements analysis, data preprocessing, and R scripts that compute cigarette brand frequencies and mine the association rules. A number of practical association rules for cigarette brand sales are found, and a bundled-sales strategy and constructive suggestions are put forward.

2) Cigarette brand association rule mining with Hadoop is studied. Experiments are run in both a stand-alone environment and a distributed environment. In the stand-alone environment, R is used for the association analysis; in the distributed environment, the computation uses Mahout, which runs on a Hadoop cluster. A performance comparison leads to the conclusion that the program in the distributed environment clearly outperforms the stand-alone one once the data volume reaches a certain scale.

3) A data analysis platform for the tobacco business is designed and implemented, and cigarette brand association rule mining is applied on it to verify the validity and practicality of the proposed method.
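
To make the mining step concrete, the following is a minimal Python sketch of Apriori-style frequent itemset search followed by rule generation with support, confidence, and lift. It is an illustration only, not the thesis's R or Mahout implementation. The function names, the thresholds, and the brand labels ("BrandA" and so on) are hypothetical, and the five sample baskets are invented for demonstration rather than taken from the Guizhou sales data.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search for itemsets with support >= min_support.

    Returns a dict mapping frozenset(itemset) -> support (fraction of baskets).
    """
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        return sum(1 for b in baskets if itemset <= b) / n

    # Level 1: frequent single items.
    current = {}
    for item in sorted({i for b in baskets for i in b}):
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = dict(current)
    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


def association_rules(frequent, min_confidence):
    """Generate rules antecedent -> consequent from the frequent itemsets.

    Returns tuples (antecedent, consequent, support, confidence, lift).
    """
    rules = []
    for itemset, supp in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for ante in combinations(sorted(itemset), r):
                antecedent = frozenset(ante)
                consequent = itemset - antecedent
                # Subsets of a frequent itemset are frequent, so lookups succeed.
                confidence = supp / frequent[antecedent]
                if confidence >= min_confidence:
                    lift = confidence / frequent[consequent]
                    rules.append((antecedent, consequent, supp, confidence, lift))
    return rules


# Invented sample baskets; each list is the set of brands in one order.
sample_orders = [
    ["BrandA", "BrandB"],
    ["BrandA", "BrandB", "BrandC"],
    ["BrandA", "BrandC"],
    ["BrandB", "BrandC"],
    ["BrandA", "BrandB", "BrandD"],
]

itemsets = frequent_itemsets(sample_orders, min_support=0.3)
for ante, cons, supp, conf, lift in association_rules(itemsets, min_confidence=0.7):
    print(sorted(ante), "->", sorted(cons),
          f"support={supp:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

A rule such as BrandA -> BrandB with high confidence is the kind of pattern a bundled-sales strategy would build on, although a lift below 1 would indicate the two brands co-occur less often than independent purchases would predict. FP-Growth yields the same frequent itemsets as this level-wise search but avoids candidate generation by compressing the transactions into a prefix tree, which is one reason it is often preferred on larger data.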
Keywords/Search Tags: Association rules, cigarette, Hadoop, R, FP-Growth algorithm, Apriori algorithm