Research On Stocks Data Analysis Based On Spark MLlib

Posted on:2020-09-26

Degree:Master

Type:Thesis

Country:China

Candidate:H Gao

Full Text:PDF

GTID:2370330578968899

Subject:Engineering

Abstract/Summary:

Stocks are a major part of a country's economy and society, but stock prices are strongly influenced by the economic environment, national policy, and domestic and foreign conditions, so the trend of a stock and its price are difficult to predict. In addition, the randomness of the stock market, the asymmetry of information, and the herd mentality of investors make stock data very difficult to analyze. Nevertheless, financial markets and stock analysis have long been a focus of research.

To analyze the trend of stock prices more accurately, this paper proposes a wavelet denoising method for stock trading data, computes and collects the technical indicators and sentiment factors commonly used in the stock market, applies principal component analysis to these factors, and then classifies the preprocessed data with machine learning. The stock data are classified with logistic regression, a support vector machine, and a random forest; the predictions of these models are then combined by voting, and a comparative experiment is conducted. The experiment shows that, after noise reduction and dimensionality reduction, combined voting yields better results.

Stock price prediction is another focus of this study. This paper uses a long short-term memory (LSTM) network to predict stock prices, with a sliding time window used to make short-term predictions. Under the same number of iterations, the results of the LSTM on the denoised, dimension-reduced data are compared with those on the raw data. Finally, the mean absolute error, the mean squared error, the root mean squared error, and the mean absolute percentage error are used to analyze the error of the network.

These two algorithms are then run on a cluster with Spark MLlib distributed learning, and performance in the single-machine and cluster environments is compared. Because the dataset itself is small, running time improves on the cluster, but the results cannot fully reflect the advantages of Spark. In the future, as natural language processing and the text analysis and quantification of stock information advance and the dataset grows, Spark's advantages should become more apparent.
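
Two steps named in the abstract can be illustrated briefly: combining the classifiers' outputs by voting, and the four error measures used to evaluate the LSTM. The abstract does not give the voting rule or the exact formulas, so the sketch below assumes a simple hard majority vote and the standard definitions of MAE, MSE, RMSE and MAPE. The function names `majority_vote` and `error_metrics` are hypothetical and not taken from the thesis.

```python
import math
from collections import Counter


def majority_vote(*model_predictions):
    """Combine per-sample class labels from several classifiers.

    Assumption: hard majority voting (each model casts one vote per sample).
    """
    combined = []
    for votes in zip(*model_predictions):
        # most_common(1) returns the label with the highest count; on a tie,
        # the label encountered first (earliest model) wins.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def error_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and MAPE (in percent) for two equal-length lists."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE divides by the actual value, so it is undefined if any actual is 0.
    mape = 100.0 * sum(abs(e / a) for a, e in zip(actual, errors)) / n
    return {"mae": mae, "mse": mse, "rmse": rmse, "mape": mape}


if __name__ == "__main__":
    # Illustrative labels only (1 = price rises, 0 = price falls); not thesis data.
    lr = [1, 0, 1]
    svm = [1, 1, 0]
    rf = [0, 0, 1]
    print(majority_vote(lr, svm, rf))
    print(error_metrics([10.0, 20.0], [12.0, 18.0]))
```

With an odd number of binary classifiers, as with the three models here, a hard vote never ties. MAPE is scale-free, which makes it convenient for comparing errors across stocks with different price levels, but it is undefined when an actual value is zero.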

Keywords/Search Tags:

Stock Analysis, Principal Component Analysis, Combination Voting, Long Short-term Memory Network

Related items

1	Portfolio Selection Based On Long-short Term Memory Neural Network
2	Application Of Long-Short Term Memory Based On Wavelet Analysis In Stock Index Forecasting
3	Research On Meteorological Prediction Based On Long Short-term Memory Network
4	The Study Of Stock Market Prediction Based On Deep Learning Networks
5	Forecasting Of Ionospheric TEC Using Long Short-Term Memory Network
6	Application Of Long Short-term Memory Network In Short-term Rainfall
7	Monitoring The Short-term Trading Of American Airlines Stock Based On The Improved EWMA Control Charts
8	Reconstruction Of Central Arterial Pressure Signal Based On Long Short-term Memory Network
9	Financial Time Series Prediction Based On Multiscale Decomposition And Long Short-Term Memory Networks
10	Research On Short Term Forecast Of Fog Based On Deep-Learning