Applied Research In Sentiment Analysis With Machine Learning Model Based On Gradient Acceleration

Posted on:2024-08-24

Degree:Master

Type:Thesis

Country:China

Candidate:S W Liu

Full Text:PDF

GTID:2568307073976669

Subject:Applied statistics

Abstract/Summary:

This thesis addresses how to train models faster while maintaining high accuracy in large-scale sentiment analysis. Using sentiment analysis of large-scale Twitter comment text as the application, it accomplishes the following three tasks.

First, for text vectorization of English Twitter comments, the Twitter sentiment analysis dataset from the Kaggle website is collated, characters other than English letters are removed by text cleaning, and the text is pre-processed through tokenization, stemming, and lemmatization, followed by feature extraction and feature selection. Feature vectorization starts from the bag-of-words model, to which the N-gram algorithm is added; the bigram model (N=2) is selected according to its performance. TF-IDF is then used to weight the gram frequencies, and the 2000 most important grams are selected to form a key gram list that serves as the final text feature.

Second, for solving machine learning models for large-scale sentiment analysis with gradient acceleration algorithms, three models, Logistic Regression (LR), Support Vector Machine (SVM), and Naive Bayes (NB), are selected according to the characteristics of large-scale sentiment analysis. Building on the stochastic gradient descent algorithm and on two acceleration strategies for the gradient direction, namely momentum acceleration and variance reduction, this thesis combines the advantages of both to improve the mini-batch gradient descent algorithm. It proposes a mini-batch gradient variance reduction algorithm with momentum acceleration and applies it to the Twitter comment sentiment analysis task. The application results show that the improved mini-batch gradient acceleration algorithm effectively reduces the number of iterations and accelerates the learning process.

Third, to compare the performance and time cost of different machine learning models based on gradient acceleration algorithms in practical applications, the three models are trained separately after the data is randomly split into training and test sets at a ratio of 8:2. The results show that the SVM model fits best on both the training and test sets but has the longest training time; the Bernoulli NB model has the shortest training time but the worst fit of the three; and the LR model falls between the two on both training time and fit. The thesis therefore concludes that the LR model is the most suitable for Twitter comment sentiment analysis.
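
The abstract does not give the update rule of the proposed algorithm, so the following is only a minimal sketch of one common way to combine the two strategies it names: an SVRG-style variance-reduced mini-batch gradient fed into a heavy-ball momentum step, applied to logistic regression. The function names, the hyperparameters (`lr`, `beta`, `batch_size`, `inner_steps`) and the synthetic two-feature data are all assumptions made for illustration; they are not taken from the thesis, which uses TF-IDF bigram features from Twitter comments.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def batch_gradient(w, X, y, idx):
    """Average logistic-loss gradient over the rows listed in idx."""
    g = [0.0] * len(w)
    for i in idx:
        err = sigmoid(sum(wj * xj for wj, xj in zip(w, X[i]))) - y[i]
        for j, xj in enumerate(X[i]):
            g[j] += err * xj
    return [gj / len(idx) for gj in g]


def logistic_loss(w, X, y):
    """Mean logistic loss, written as log(1 + exp(z)) - y * z in a stable form."""
    total = 0.0
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi))
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - yi * z
    return total / len(X)


def train_mb_svrg_momentum(X, y, epochs=20, inner_steps=10,
                           batch_size=8, lr=0.1, beta=0.9, seed=0):
    """Assumed combination: SVRG-style variance reduction + heavy-ball momentum."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    v = [0.0] * d  # momentum buffer
    for _ in range(epochs):
        # Snapshot the weights and compute the full gradient once per epoch.
        w_snap = list(w)
        full_grad = batch_gradient(w_snap, X, y, range(n))
        for _ in range(inner_steps):
            idx = rng.sample(range(n), batch_size)
            g_cur = batch_gradient(w, X, y, idx)
            g_snap = batch_gradient(w_snap, X, y, idx)
            for j in range(d):
                # Variance-reduced gradient: mini-batch gradient corrected
                # by the snapshot gradient on the same mini-batch.
                g_vr = g_cur[j] - g_snap[j] + full_grad[j]
                v[j] = beta * v[j] + g_vr
                w[j] -= lr * v[j]
    return w


def make_toy_data(n=200, seed=1):
    # Hypothetical stand-in for the text features: a bias term plus two
    # Gaussian features centred at +1 (label 1) or -1 (label 0).
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 1.0 if label == 1 else -1.0
        X.append([1.0, rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


X, y = make_toy_data()
w = train_mb_svrg_momentum(X, y)
print("loss at w = 0:      ", logistic_loss([0.0] * 3, X, y))
print("loss after training:", logistic_loss(w, X, y))
```

Each inner step costs two mini-batch gradients instead of one, plus one full gradient per epoch. That extra cost is the usual trade-off of variance reduction, and it only pays off if the number of iterations falls enough, which is the effect the abstract reports.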

Keywords/Search Tags:

Large-scale sentiment analysis, Machine learning, Gradient acceleration, Improved mini-batch gradient acceleration algorithm

Related items

1	Research On The Generation Of Adversarial Example Based On Batch Gradient
2	Research On Adversarial Sample Generation Method Based On Gradient Masking
3	A Research Of Stochastic Gradient Descent Algorithm
4	Research On Variance Reduction Gradient Algorithm For Large-scale Data In Machine Learning
5	Research And Implementation Of Distributed Machine Learning Acceleration Component Based On RDMA Batch Operation
6	Design And Implementation Of Machine Learning Supporting Platform For Big Data
7	Research On Acceleration Technology Of Large-scale Scene 3D Reconstruction
8	Improved Distributed Gradient Descent Optimization Algorithms
9	Fast Support Vector Machines Classification Algorithm With Additive Kernel
10	Improvement Of Logistic Regression Algorithm And Research On Parallelization Based On TensorFlow