Recognition Of The Quality Of Wine Based On Data Mining

Posted on:2011-12-05

Degree:Master

Type:Thesis

Country:China

Candidate:C X Lin

GTID:2191360305494746

Subject:Statistics

Abstract/Summary:

With the development of the domestic wine industry, both the number and the scale of wine companies are increasing. However, China's wine industry still faces fierce competition from imported wines, as well as market disorder caused by the lack of a quality assessment system. To address these problems, this thesis first discusses the deficiencies of manual wine tasting and then proposes a methodology that improves the recognition rate of wine quality by means of data mining techniques. This may benefit the quality control of wines as well as the stable development of the Chinese wine market.

In data mining, unbalanced data are the common case. Compared with the class that has more samples, the minority class has less influence on prediction accuracy. Even when the samples overall are classified with high accuracy, the samples of the minority class may go unrecognized, and the classification rules that identify the minority class are ignored. The innovation of this thesis lies in fitting a model to balanced samples drawn from the unbalanced data and then using that model to predict the test samples. By repeating this process N times (e.g., N = 1000), the final prediction is made by voting. The method greatly improved the recognition rate of low-quality wine.

First, based on this sampling scheme, discriminant analysis, support vector machines, classification and regression trees, and random forests were compared for the recognition of wine quality. Among these methods, random forests performed best, achieving both a higher overall recognition rate and a higher identification rate for low-quality wine. Moreover, the random forest model was shown to be stable.

Second, the average importance of all variables can be obtained from random forests. The variable importance ranking indicated that potassium sulfate and alcohol are important factors influencing wine quality: an increase in potassium sulfate and/or alcohol tends to result in a higher-quality wine. The variable importance ranking may therefore also help in brewing higher-quality wine.

Finally, an outlier detection method was applied to detect the samples of low-quality wine. Used on its own, it identified low-quality wine poorly: only 30% of the low-quality samples were identified. Outlier detection can therefore only complement the other results on wine quality identification. Even so, the results show that adding outlier detection improved the identification rate.
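
The balanced-sampling and voting scheme described in the abstract can be sketched as follows. This is a minimal illustration and not the thesis's implementation. It assumes that "balanced samples" means drawing the same number of samples from every class without replacement, with that number set by the size of the smallest class. It also substitutes a simple nearest-centroid classifier for the base learners the thesis compares (discriminant analysis, support vector machines, CART, random forests). The function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def fit_centroids(X, y):
    """Stand-in base learner (assumption): mean feature vector per class."""
    counts = Counter(y)
    sums = {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict_centroids(model, row):
    """Return the class whose centroid is nearest (squared Euclidean distance)."""
    return min(
        model,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(row, model[c])),
    )


def balanced_vote_predict(X_train, y_train, X_test, n_rounds=1000, seed=0):
    """Repeat balanced undersampling n_rounds times and take a majority vote."""
    rng = random.Random(seed)

    # Group training indices by class label.
    by_class = {}
    for i, label in enumerate(y_train):
        by_class.setdefault(label, []).append(i)

    # Every class contributes as many samples as the smallest class has.
    n_min = min(len(indices) for indices in by_class.values())

    votes = [Counter() for _ in X_test]
    for _ in range(n_rounds):
        # Draw one balanced subsample (without replacement within each class).
        sample = [
            i for indices in by_class.values() for i in rng.sample(indices, n_min)
        ]
        model = fit_centroids(
            [X_train[i] for i in sample], [y_train[i] for i in sample]
        )
        # Each fitted model casts one vote per test sample.
        for vote, row in zip(votes, X_test):
            vote[predict_centroids(model, row)] += 1

    # Final prediction: the label with the most votes for each test sample.
    return [vote.most_common(1)[0][0] for vote in votes]


# Synthetic toy data (not from the thesis): 50 "high" samples, 5 "low" samples.
X_train = [[0.01 * i, 0.0] for i in range(50)] + [[5.0, 5.0 + 0.1 * i] for i in range(5)]
y_train = ["high"] * 50 + ["low"] * 5
print(balanced_vote_predict(X_train, y_train, [[0.1, 0.1], [4.9, 5.1]], n_rounds=25))
```

Because each round trains on equally many samples per class, the minority class carries the same weight as the majority class in every fitted model, and the vote over many rounds reduces the dependence on any single random subsample.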

Keywords/Search Tags:

Wine Quality Recognition, Discriminant Analysis, Support Vector Machines, Classification and Regression Trees, Random Forests, Outlier Detection

Related items

1	Spot Welding Quality Classification Based On Support Vector Machine Model
2	Designing An Expert System For EOR Screening Based On Machine Learning Classifiers
3	Quality Analysis Of Edible Olive Oil By Chemometrics Methods And FT-IR,GC-MS
4	Research On Fabric Defect Recognition And Classification Algorithms Based On Wavelet Analysis And Support Vector Machine
5	Ensemble-Based Robust Chemometrics For Analyzing Metabonomics Dataset
6	Study On The Prediction And Assessment Methods Of Water Environment Quality Based On Support Vector Machines Theory
7	The Application Of Several Data Mining Algorithms In The Classification Of Ceramic Materials
8	The Outlier Detection Of Chemical Data And Application
9	Support Vector Machine Algorithm Is Applied To The Biological Activity Of The Mixed System Level Classification Of The Spectrum Of Quantitative Analysis Of Heavy Elements
10	Application Of Chemometrics In Complex System Of Traditional Chinese Medicine Analysis