The Application Of Data Mining Method In Quality Identification Of Wine

Posted on: 2015-12-25
Degree: Master
Type: Thesis
Country: China
Candidate: H B Zhang
Full Text: PDF
GTID: 2181330452451229
Subject: Applied Statistics
Abstract/Summary:
Grape wine is becoming increasingly popular with robust economic growth. However, wine quality identification still depends largely on manual tasting by wine tasters, which can hardly meet today's huge market demand. As the concept of big data has taken root, scientific methods for testing the physicochemical properties of wine have emerged, providing strong support for applying data mining to wine quality identification. This thesis uses data mining methods to identify wine quality from data on the physical and chemical properties of grape wine.

At this stage, the use of data mining to identify wine quality is still rare. A common problem is that, although the classification models achieve a high overall accuracy, their accuracy on low-quality wine is relatively low. In this thesis, the Logistic and multinomial model, the Tan neural network, the BP neural network with the error term, and the C5.0 decision tree serve as the theoretical framework. The thesis focuses not only on the overall prediction accuracy but also analyses in depth the accuracy for each quality category. The author concludes that classification on imbalanced data neglects the minority categories despite the high overall accuracy.

To address this, the thesis combines SMOTE oversampling with undersampling to balance the data and selects the optimal decision tree classification model. To further improve prediction accuracy and to account for unequal misclassification costs, it combines boosted decision trees with cost-sensitive learning. Compared with the original classification, this approach improves overall accuracy, greatly improves the identification of low-quality wine, and reduces the cost of false positives.
Keywords/Search Tags: Wine Quality Appraisal, Data Mining, Sorting Algorithms, Imbalanced Data, Boosting, Cost-sensitive Learning
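
The abstract's two central points, that a high overall accuracy can hide poor accuracy on a minority class and that SMOTE-style oversampling can rebalance the data, can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The labels, the feature values, and the helper names (`overall_accuracy`, `per_class_recall`, `smote_like`) are hypothetical and chosen only for illustration. `smote_like` shows only the interpolation idea behind SMOTE, not a full implementation.

```python
import random
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples that are predicted correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true class, the fraction of its samples predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE's core idea).
    Assumes at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest neighbours of x among the other minority samples
        neighbours = sorted(
            (m for m in minority if m is not x),
            key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Hypothetical labels: 90 "normal" wines and 10 "low" quality wines.
y_true = ["normal"] * 90 + ["low"] * 10
# A degenerate classifier that always predicts the majority class.
y_pred = ["normal"] * 100

acc = overall_accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)
print(acc)     # 0.9
print(recall)  # {'normal': 1.0, 'low': 0.0}

# Hypothetical two-feature measurements for the minority (low-quality) class.
low_quality = [(0.70, 9.1), (0.82, 9.4), (0.65, 9.0), (0.90, 9.8)]
new_points = smote_like(low_quality, n_new=6)
print(len(new_points))  # 6
```

In this toy case the classifier reaches 90% overall accuracy while identifying none of the low-quality wines, which is the pattern the abstract describes for imbalanced data. The synthetic points always lie on segments between existing minority samples, so they stay inside the range already covered by that class.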