| With the rapid development of Internet technology, browsing websites and publishing information have become an indispensable part of people's online lives. At the same time, major websites and forums have accumulated huge amounts of Internet public opinion data that are worth mining and analyzing. Internet public opinion monitoring, data analysis, and related work have become part of enterprise digital strategy. For auto companies, Internet public opinion data analysis continues to show its application value. To open up new marketing areas, an auto company depends on accurately positioning new vehicle models and meeting consumer demand with them. Traditional data analysis relies on statistical yearbooks, industry management data, and field marketing surveys, and it suffers from limited sample sizes, delays, and low accuracy. Moreover, traditional data collection and statistics can hardly meet the needs of data analysis in a big data environment. The lack of automatic data analysis tools and effective analysis methods has become a pain point for business development. In addition, because data are not accumulated, data reusability is poor and the development of analysis methods is held back. The key to solving these problems is to collect and accumulate Internet public opinion data from auto websites and then build an efficient and reliable data analysis system that supports the auto company's strategic decisions through data visualization. This paper first describes the research background and the current status of Internet public opinion data analysis and introduces the research content. Second, it introduces the methodology of Internet public opinion data analysis and text mining, together with the related algorithms and techniques. Based on the current status, the paper proposes an overall solution. It then discusses the research and
experiments on sentiment analysis, marketing insight analysis, and new word detection. For sentiment analysis, the paper briefly introduces the preprocessing of Internet public opinion data, including data blending and cleaning, text conversion, sentence splitting, and word segmentation, with emphasis on the principles and implementation of sentence splitting and word segmentation. An automotive text corpus is built according to the characteristics of the auto industry. The paper proposes a sentiment analysis process for public opinion that combines word corpus matching with the machine learning algorithms of Spark, and it describes the training and evaluation of these algorithms in detail. After training and testing several text classification algorithms, such as Decision Tree, Logistic Regression, and Naive Bayes, the algorithm with the best F-measure is used to construct the system. The paper studies analysis methods for Internet public opinion data under three themes: sentiment analysis, auto marketing insight analysis, and new word discovery. Prior to this analysis, the data are prepared through collection, sentence splitting and word segmentation, and construction of the thesaurus. Building on the sentiment analysis of public opinion, the auto marketing insight analysis proposes methods for analyzing competitive relationships between vehicles, vehicle purchase comparisons, and decision paths. Word corpus matching is used for data processing, and the logical model of the subject areas is designed and built on a data warehouse model. Finally, the paper provides a data visualization solution that presents the analysis results in a variety of charts. To ease the difficulty of maintaining the text corpus, the paper combines the Hidden Markov Model, information entropy, and regular expressions to study new word detection based on Internet public opinion from auto websites. This supplements the text corpus and offers insight into popular Internet
language. It also makes maintenance easier for the word corpus administrator and offers a solution for text corpus management. Finally, this paper introduces the design and implementation of a data analysis system based on automotive Internet public opinion and completes the construction of the system. The system provides business users with effective and reliable data analysis results. It also helps auto companies understand consumers' needs and preferences and gain insight into the competitive landscape of vehicle products, supporting decisions on future vehicle product planning and positioning. The system has proven its application value in enterprise digital transformation. |
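
The abstract states that the classifier used in the system is chosen by F-measure after several algorithms are trained and tested. A minimal sketch of that selection step is given below in plain Python. It is an illustration only, not the paper's Spark implementation: the function names `f_measure` and `pick_best`, the binary positive/negative labeling, and the toy predictions are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def f_measure(y_true: List[int], y_pred: List[int], beta: float = 1.0) -> float:
    """F-measure for the positive class (label 1); beta=1 gives the usual F1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def pick_best(y_true: List[int],
              predictions: Dict[str, List[int]]) -> Tuple[str, float]:
    """Return the name and score of the candidate with the highest F-measure."""
    scores = {name: f_measure(y_true, pred) for name, pred in predictions.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical held-out sentiment labels (1 = positive, 0 = negative) and
# hypothetical predictions from three candidate classifiers.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
predictions = {
    "decision_tree":       [1, 0, 1, 0, 1, 1, 0, 0],
    "logistic_regression": [1, 1, 1, 0, 0, 1, 1, 1],
    "naive_bayes":         [1, 1, 0, 0, 0, 0, 0, 1],
}

best_name, best_score = pick_best(y_true, predictions)
print(best_name, round(best_score, 3))
```

On this toy data the candidate named `logistic_regression` scores highest. That says nothing about which algorithm the paper actually selected, since the abstract does not report it.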