
Research On Model And Algorithms Of Profit Mining Based On Microeconomic View

Posted on: 2006-11-30
Degree: Master
Type: Thesis
Country: China
Candidate: X J Xu
GTID: 2166360155953095
Subject: Computer application technology
Abstract/Summary:
In this paper we systematically study the profit mining problem from a microeconomic view. To solve enterprise optimization problems, we implement several algorithms that focus mainly on item selection and catalog segmentation.

So far, only a few theoretical frameworks for data mining have been proposed in the literature. The microeconomic framework is considered one of the most promising of these. In 1998, Jon Kleinberg et al. introduced the microeconomic view into the data mining domain by proposing a theory of the extracted patterns. They viewed data mining in terms of actions, and sought the set of actions that generates maximum revenue by using customer information. Microeconomics is the theoretical basis of profit mining. It treats the problems that enterprises face as optimization problems, and treats data mining as an economically motivated optimization problem over a large volume of unaggregated data. Data mining is about extracting, from raw data, interesting patterns that are used for making enterprise decisions.

A major obstacle in data mining applications is the gap between statistics-based pattern extraction and value-based decision making. Ke Wang et al. first presented a profit mining approach to reduce this gap in 2002. Given a set of transactions and pre-selected target items, profit mining builds a model for recommending target items and promotion strategies to new customers in order to maximize the net profit.

Profit mining is an emerging topic and the ultimate objective in the data mining literature; it is based on the microeconomic view and pursues the highest profit by applying data mining knowledge. Profit mining focuses mainly on how to maximize profit for businesses and enterprises.
The item selection problem and the catalog segmentation problem are the key research areas of profit mining.

The item selection problem concerns how to pick a set S of J items from all given items so as to maximize profit, on the basis of a given set of transactions in which each item is assigned a profit and cross-selling factors (csfactor for short). The profit of an item comes not only from its own sales but also from its influence on the sales of other items. Some items might not generate a large profit themselves, but they play key roles for other items. This influence between items is the cross-selling effect, and it is quantified by the cross-selling factor. Even when only simple factors among items are considered, the item selection problem is still NP-hard.

In 1999, Brijs T. et al. were the first to improve product assortment decisions by using the association rule framework. They integrated the discovery of frequent itemsets with a microeconomic model for product selection (PROFSET). Frequent itemsets can identify potential cross-sales between product items and thus lead to a better product selection. The PROFSET model uses only the concept of support and does not use the concept of confidence.

The influence of items on each other is expressed by the cross-selling factor. This paper discusses three methods for modeling the cross-selling factor between items. After studying the well-known web page ranking algorithm HITS, Ke Wang et al. proposed the HAP algorithm by introducing the concepts of hub and authority, and defined the concept of the loss rule for the first time. An evaluation method for item selection was then presented for the first time. This paper proposes a novel algorithm, ItemRank, which introduces a decaying factor into the customer purchase model on the basis of another well-known web page ranking algorithm, PageRank.
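
To make the PageRank idea concrete, the following is a minimal sketch of PageRank-style power iteration over an item graph. It is not the ItemRank algorithm of this thesis, whose customer purchase model is not given in the abstract. The function name `rank_items`, the edge weights (read here as "buying item a tends to lead to buying item b"), and the damping value 0.85 are all assumptions made for illustration.

```python
def rank_items(weights, damping=0.85, iterations=50):
    """PageRank-style scores for items (illustrative sketch only).

    weights[a][b] > 0 is a hypothetical cross-selling weight from item a
    to item b. `damping` plays the role of the decaying factor.
    """
    # Collect every item that appears as a source or a target.
    items = sorted(set(weights) | {b for outs in weights.values() for b in outs})
    n = len(items)
    score = {i: 1.0 / n for i in items}

    for _ in range(iterations):
        # Every item starts each round with the same base share.
        new = {i: (1.0 - damping) / n for i in items}
        for a in items:
            outs = weights.get(a, {})
            total = sum(outs.values())
            if total == 0:
                # Item with no outgoing edges: spread its score evenly.
                for b in items:
                    new[b] += damping * score[a] / n
            else:
                # Pass the score along edges in proportion to their weight.
                for b, w in outs.items():
                    new[b] += damping * score[a] * w / total
        score = new
    return score


if __name__ == "__main__":
    # Hypothetical toy data: items a and b both lead to c, and c leads to a.
    toy = {"a": {"c": 1.0}, "b": {"c": 1.0}, "c": {"a": 1.0}}
    for item, value in sorted(rank_items(toy).items()):
        print(item, round(value, 3))
```

In this toy graph item c collects score from both a and b, so it ranks highest, while b, which no item leads to, keeps only the base share.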

The MPIS (maximal-profit item selection) problem is to find, from a customer-oriented perspective and with cross-selling taken into account, the item selection with maximal profit. Another formula is given for the cross-selling factor between items. A heuristic approach is used to solve this problem, and a new FPMPISTree, derived from the FPTree, is constructed to evaluate the corresponding profit. The ISM problem is to find a subset of items to serve as marketing items in order to boost the sales of the store; in ISM, marketing means discounting items or giving items away for free. Given the csfactor between marketing items and non-marketing items, a hill climbing approach is presented for this problem. We also try to solve the item selection problem with a genetic algorithm, and define selection, crossover and mutation operators on the basis of the two problems above.

In catalog segmentation, the enterprise wants to segment its items into k parts and use a different strategy for each part so as to maximize the overall profit. The catalog is a promotional catalog, i.e. a collection of products (items) presented to a customer in the hope of encouraging a purchase. During catalog segmentation, the enterprise also divides its customers into several segments. A different catalog of interest is designed for each segment, so that each segment is addressed differently. In the catalog segmentation problem, the customer database can be represented as a bipartite graph consisting of a customer set, a product set, and an edge set in which each edge denotes that the corresponding customer is interested in the corresponding product. This problem is NP-complete.

Making a single business decision for all customers does not yield the maximum profit, and it is impractical to make a separate decision for every customer. It is therefore advantageous to consider the segmented version of this optimization problem: choose a set of k business decisions such that, if each customer is assigned the most profitable of them, the overall profit is maximized.
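
The segmented objective described above can be sketched on a tiny example. The code below is an illustration under stated assumptions, not an algorithm from this thesis: each customer's profit is taken to be the number of products of interest that appear in the assigned catalog, every catalog has a fixed size, and the search is exhaustive, which is only feasible for toy inputs because the problem is NP-complete. The names `segmented_profit` and `best_catalogs` are hypothetical.

```python
from itertools import combinations


def segmented_profit(interests, catalogs):
    """Total profit when each customer gets the most profitable catalog.

    interests maps a customer to the set of products they are interested
    in (the edges of the bipartite graph). Profit for one customer is
    assumed to be the overlap between their interests and the catalog.
    """
    return sum(
        max(len(wanted & catalog) for catalog in catalogs)
        for wanted in interests.values()
    )


def best_catalogs(interests, k, size):
    """Exhaustively pick k catalogs of `size` products each (toy scale only)."""
    products = sorted(set().union(*interests.values()))
    candidates = [frozenset(c) for c in combinations(products, size)]
    return max(
        combinations(candidates, k),
        key=lambda chosen: segmented_profit(interests, chosen),
    )


if __name__ == "__main__":
    # Hypothetical customer database: two customers like a and b, one likes c and d.
    db = {"u1": {"a", "b"}, "u2": {"a", "b"}, "u3": {"c", "d"}}
    for k in (1, 2):
        chosen = best_catalogs(db, k=k, size=2)
        print(k, [sorted(c) for c in chosen], segmented_profit(db, chosen))
```

With one catalog the third customer cannot be served, while two catalogs cover every customer's interests, which is the gain that segmentation is meant to capture.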

Because of the csfactor between items, it is difficult to create catalogs for profit-based catalog segmentation. The single mailing problem addresses how to segment the customers and send each segment a different catalog. There are three different algorithms for solving the single mailing problem. The first, called indirect catalog creation, uses a clustering algorithm to identify the customer segments and then derives the best catalog for each segment. The second, called direct catalog creation, tries to identify a catalog and its associated customer segment simultaneously. The third, called hybrid catalog creation, solves the problem by combining elements of the first two algorithms.

In 2004, Ester M. et al. investigated another problem in microeconomic data mining: customer-oriented catalog segmentation, where the overall utility is measured by the number of customers that have at least a specified...
Keywords/Search Tags: Microeconomic