
Research on Risk Assessment and Early Warning of E-commerce Product Quality Based on User Reviews

Posted on: 2019-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: K Y Ma
Full Text: PDF
GTID: 2429330551961571
Subject: Management Science and Engineering
Abstract/Summary:
With the globalization of the economy, the spread of the Internet has changed how people live, and online shopping has become part of the daily routine. Through Tmall, JD and other e-commerce sites, customers can buy goods from all over the world without leaving home. However, these websites also carry many inferior, shoddy and even counterfeit goods, which can find their way into everyday life. To address this problem, this paper proposes a new risk forecast model that builds on previous research. The model uses a Bayesian algorithm improved with minimum hash values (Minhash) and is adapted for distributed parallel computing on Spark. Its dataset consists of customer comments. Compared with previous work, the model is more efficient: the evaluation can be carried out at lower cost and with better results. In addition, the model can identify a potential risk and assign it a risk level, so that the government and the website operator have more time to prepare and prevent a crisis. The main contents of this paper are as follows.

First, to obtain the original customer comments, the model includes a self-developed program that collects comments from Tmall, JD, Amazon and other major e-commerce websites. This paper presents the basic principles of data mining and compares several algorithms that are useful for risk forecasting.

Second, this paper proposes an improved Bayesian algorithm based on the Minhash algorithm. A review of current Bayesian classification and Bayesian network algorithms shows that none is well suited to both product quality risk forecasting and public sentiment forewarning. The features of online products are usually interconnected, which means that their distributions are not independent. The Minhash algorithm is therefore used to calculate the correlation between each feature and the risk level, and the result is used as the weight of that feature, which makes the model more suitable for risk forecasting of online products.

Third, to meet the demands of big data, the model uses Spark to process the data. This paper describes the background of Spark, from its basic principles to its components (i.e., Spark SQL, Spark Streaming, MLlib and GraphX).

A simulation experiment was carried out on Ubuntu 16.04 to test the algorithm's performance. Taking "anti-ultraviolet clothing" as an example, this paper demonstrates the process of comment collection, data pre-processing, attribute extraction, high-dimensional data reduction and model improvement. The example compares the efficiency and accuracy of the forecast, shows how this research can serve both the government and the website operator, and offers them some advice.
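The Minhash-based feature weighting described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything here is an assumption made for illustration: each feature is represented by the set of reviews that mention it, each risk level by the set of reviews labelled with it, and the weight is the Minhash estimate of the Jaccard similarity between the two sets. The names `minhash_signature`, `estimated_jaccard` and `feature_weights` are hypothetical, and the sketch runs on a single machine rather than on Spark.

```python
import hashlib


def _hash(seed: int, item: str) -> int:
    """Deterministic 64-bit hash of an item under a given seed."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=128):
    """Return the Minhash signature: the minimum hash value per seed."""
    items = list(items)
    if not items:
        raise ValueError("cannot build a signature for an empty set")
    return [min(_hash(seed, x) for x in items) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching positions, an estimate of Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def feature_weights(reviews, labels, risk_level, num_hashes=128):
    """Weight each feature by its estimated overlap with one risk level.

    reviews: list of token sets, one per review (assumed input shape).
    labels:  list of risk levels, aligned with reviews.
    """
    # Reviews labelled with the target risk level, identified by index.
    risk_ids = {str(i) for i, y in enumerate(labels) if y == risk_level}
    risk_sig = minhash_signature(risk_ids, num_hashes)

    weights = {}
    for feature in set().union(*reviews):
        # Reviews that mention this feature, identified by index.
        feature_ids = {str(i) for i, r in enumerate(reviews) if feature in r}
        feature_sig = minhash_signature(feature_ids, num_hashes)
        weights[feature] = estimated_jaccard(feature_sig, risk_sig)
    return weights
```

Under these assumptions, a feature that appears in exactly the reviews labelled with a given risk level receives a weight of 1.0, and a feature that never appears in those reviews receives 0.0. Such weights could then scale each feature's contribution in a naive Bayes classifier, which is one plausible reading of the improvement the abstract describes.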
Keywords/Search Tags:online product sales, quality supervision, risk assessment, public sentiment forewarning, Spark, user comments mining