Predicting the Credibility of Market Basket Analysis by using Error Estimating techniques: Application of Micro Ingredient Manufacturing Sales Data |
| Posted on:2013-01-21 | Degree:M.S | Type:Thesis |
| University:University of Nebraska at Omaha | Candidate:Belongia, Dwayne | Full Text:PDF |
| GTID:2458390008984038 | Subject:Computer Science |
| Abstract/Summary: | PDF Full Text Request |
| When mining a small database, cross-validation techniques can show how credible the results are. Cross-validation partitions the data into smaller segments in order to obtain more reliable error estimates from a small dataset. Typically the dataset is divided into two partitions. The first partition, usually two thirds of the dataset, is set aside for training. The training partition is used to find the best method for this particular dataset. The training partition is further divided into several equal-size folds; typically ten folds are adequate. Each fold is then processed N times, after which the results are combined and averaged to give one final result. Leave-one-out and bootstrap are cross-validation techniques in that the data is partitioned into smaller units, or folds. Bootstrap employs the statistical process of sampling with replacement, while leave-one-out uses N folds, just as cross-validation does. In this thesis we compare the techniques and results of the leave-one-out and bootstrap methods with a combined bootstrap leave-one-out method to find the best training data for a market basket analysis of a small manufacturing business database. |
| Keywords/Search Tags: | Data, Techniques, Small, Cross-validation, Training |
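The partitioning schemes named in the abstract can be illustrated with a minimal Python sketch. It only generates record indices for each scheme and is not the thesis's implementation. The function names are hypothetical, the use of unsampled ("out-of-bag") records as the bootstrap test set is an assumption, and the combined bootstrap leave-one-out method is not reproduced because the abstract does not describe it.

```python
import random


def k_fold_splits(n, k):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n))
    # Spread the remainder so fold sizes differ by at most one record.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def leave_one_out_splits(n):
    """Leave-one-out: k-fold with one fold per record."""
    return k_fold_splits(n, n)


def bootstrap_split(n, rng):
    """Sample n indices with replacement for training.

    Assumption: records never drawn (out-of-bag) serve as the test set.
    """
    train = [rng.randrange(n) for _ in range(n)]
    drawn = set(train)
    test = [i for i in range(n) if i not in drawn]
    return train, test


def mean_error(errors):
    """Average per-split error estimates into one final figure."""
    return sum(errors) / len(errors)
```

A caller would train and score a model on each (train, test) pair, collect the per-split errors, and pass them to `mean_error`; passing a seeded `random.Random` to `bootstrap_split` keeps the resampling reproducible.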