Analysis Of Technology Stocks Based On Multi-source Heterogeneous Data

Posted on: 2021-04-24  Degree: Master  Type: Thesis
Country: China  Candidate: Y Z Wang  Full Text: PDF
GTID: 2439330611999037  Subject: Applied statistics
Abstract/Summary:
The stock market plays an important role in a country's national economy. A country's stock market offers a rough measure of its economic development, so the study of stock market fluctuations has become increasingly practical: it helps in understanding the macroeconomic situation and helps regulatory authorities issue targeted regulations. Since the successive establishment of the Shanghai Stock Exchange and the Shenzhen Stock Exchange, China's stock market has grown from scratch into a more mature system that plays an important role in the rational allocation of financial resources.

The efficient market hypothesis holds that any information in the market affects stock price fluctuations, so stock prediction should integrate as much information from different sources as possible. Market information falls broadly into two categories: fundamental information about a stock, that is, its historical data, and media information, including news events and investor comments. The two come from different sources and usually have different data structures: fundamental information is structured numerical data, while media information is usually unstructured text. Existing studies typically either analyze the relationship between stock market fluctuations and only one of the two kinds of information, or use both but store them together in a single vector, ignoring the interactions between data from different sources. How to integrate multi-source heterogeneous data effectively remains a difficult problem.

This thesis proposes a stock market analysis method based on multi-source heterogeneous data. First, a second-order tensor is used to store stock fundamental information and media information; compared with vector storage, a tensor can better capture the interaction between the two. Then an event-driven convolutional gated recurrent unit (ConvGRU) is used: an event-driven factor determined by news events is added to the traditional ConvGRU, thereby strengthening the influence of news events.

The media information consists mainly of news and comment data. Because these are text data and cannot be processed directly, this thesis first uses a sentiment dictionary method to analyze their sentiment polarity, using the SO-PMI algorithm to expand the "sentiment analysis vocabulary" proposed by HowNet so that the dictionary better suits the processing of online text. In addition, the thesis uses a CNN-LSTM hybrid model, in which a convolutional layer and a pooling layer extract features that are flattened into a vector and fed into an LSTM, to classify the sentiment of the news and comment data, and finally compares the classification performance of the two approaches.

In the empirical analysis, experiments are conducted on both the Shanghai Stock Index and individual stocks. To make comparison easier, controlled-variable experiments test the effectiveness of the tensor storage method and the event-driven mechanism. The results show that, for stock forecasting, the improved tensor-based GRU model outperforms the vector-based model, and that the event-driven mechanism also has a positive effect on stock price prediction.
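
To make the dictionary-expansion step concrete, here is a minimal sketch of SO-PMI scoring. The abstract does not give the seed words, corpus, co-occurrence window, or smoothing, so everything here is an assumption chosen for illustration: the function name `so_pmi`, the document-level co-occurrence counts, the additive `smoothing` constant, and the toy English tokens (the thesis itself works on Chinese text with the HowNet vocabulary). A candidate word is scored as the sum of its PMI with the positive seeds minus the sum of its PMI with the negative seeds; a positive score suggests positive polarity.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(docs, pos_seeds, neg_seeds, smoothing=0.5):
    """Score each non-seed word by SO-PMI (hypothetical helper, not the thesis code).

    docs: list of token lists. Co-occurrence means "appears in the same document".
    Returns {word: score}; score > 0 suggests positive polarity.
    """
    n_docs = len(docs)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing both words of a pair
    for tokens in docs:
        vocab = sorted(set(tokens))
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))  # pairs are stored in sorted order

    def pmi(a, b):
        key = (a, b) if a < b else (b, a)
        # Additive smoothing (an assumption) keeps log2 defined for unseen pairs.
        joint = (pair_df[key] + smoothing) / n_docs
        return math.log2(joint / ((word_df[a] / n_docs) * (word_df[b] / n_docs)))

    seeds = set(pos_seeds) | set(neg_seeds)
    scores = {}
    for word in word_df:
        if word in seeds:
            continue
        pos = sum(pmi(word, s) for s in pos_seeds if s in word_df)
        neg = sum(pmi(word, s) for s in neg_seeds if s in word_df)
        scores[word] = pos - neg
    return scores


# Toy corpus, invented for illustration only.
docs = [
    ["profit", "surge", "good"],
    ["surge", "good"],
    ["loss", "plunge", "bad"],
    ["plunge", "bad"],
]
scores = so_pmi(docs, pos_seeds=["good"], neg_seeds=["bad"])
```

In this sketch, words whose score clears a chosen positive or negative threshold would be added to the corresponding polarity list of the dictionary; the threshold itself is another parameter the abstract does not specify.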
Keywords/Search Tags: multi-source heterogeneous data, sentiment dictionary, SO-PMI algorithm, correlation analysis, GRU