Research On Methods Of Soft Set Forecasting Based On Text Data

Posted on: 2017-02-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D L Yang
GTID: 1319330536450930
Subject: Management Science and Engineering
Abstract/Summary:
In the big data age, text data is a vital medium of communication. Companies post recruitment advertisements and offers as text, news agencies describe ongoing events in text, and the public expresses opinions and emotions in text. Text data therefore holds considerable value for companies and individuals, and extracting information from it is one way to gain a competitive advantage in the big data era. Research on forecasting from text data is one way to extract that information. However, the natural language character and the inexact nature of text data, both of which are forms of uncertainty, are obstacles to forecasting based on it. It is therefore necessary to find a theory suited to uncertainty and to develop a corresponding forecasting method on top of it. Soft set theory is one of the advanced methods for dealing with uncertainty. It derives from the study of the approximate description problem, takes finding approximate solutions as its fundamental idea, uses a parameterized family of sets to describe problems, focuses on constructing imprecise decision models, and provides approximate solutions. In terms of its fundamental idea, its way of describing problems, and the solutions it yields, soft set theory is a suitable basis for a forecasting method that can deal with uncertainty. Combining forecasting problems based on text data with soft set theory, and establishing a soft set forecasting method based on text data, will offer companies and individuals a reliable tool to discover, acquire, and absorb the value implied in text data.

This paper studies the soft set forecasting method based on text data from the following three perspectives.

Firstly, the paper investigates the soft set feature selection method based on text data (FSST). Feature selection is an important stage in forecasting based on text data, so the paper constructs FSST to deal with the inexact relationships between features and to solve the feature selection problem. The proposed method constructs a new soft set model based on equivalence classes, the paired relationship soft set (PRSS), and further proposes the approximate soft set, the dependency degree soft set, and the indiscernibility relation soft set to handle the inexact relationships between features. The paired relationship soft set eliminates the redundancy caused by the earlier equivalence-class-based soft set model (NSS) and calculates the dependency degree in matrix form, which improves computing efficiency. In the example analysis section, the paper walks through the execution process of FSST. It then compares FSST with the feature selection method based on NSS on 16 sample databases. The results show that FSST maintains accuracy and extensibility while improving computing efficiency.

Secondly, the paper explores the soft dependency forecasting method based on text data. The method inherits the advantages of the soft probability, the soft conditional probability, and the soft dependency in handling the natural language characteristic and the inexact characteristic: it treats the forecasting task as a whole process, updates dynamically when the data set changes, does not require strict assumptions of probability stationarity, and constructs inexact models to obtain approximate solutions. The paper introduces the concepts of the soft probability, the soft conditional probability, the soft estimation, and the soft dependency, describes the problem to be solved, and constructs three models that are used to build the forecasting method: the soft dependency forecasting model based on text data, the feature soft set model, and the dependency soft set model. The soft dependency forecasting model establishes the relationship between the soft dependency and the soft set forecasting problem that is based on text data and does not consider the time lag effect. Its implementation relies on the feature soft set model and the dependency soft set model. The feature soft set model integrates FSST; it can deal with the inexact relationships between features and transforms text data into a vector space model. The dependency soft set model calculates the soft estimation and completes the forecast. To handle the empty sets produced by the dependency soft set model and the superabundant features encountered during forecasting, the paper introduces methods for searching for an approximate set and for applying a heuristic algorithm. To evaluate the soft estimation, the paper defines three soft estimation error metrics: the error soft mapping, the single error soft mapping, and the total error. It also introduces the two point-to-set error metrics that these require: a Theil inequality coefficient based on the Hausdorff distance and a Theil inequality coefficient based on the minimal Manhattan distance. In the example analysis section, the implementation of the soft dependency forecasting method is presented. In the application analysis section, the method is used to predict the current share price volatility of ten companies from 8-K reports. The paper then analyses the advantages and disadvantages of the method and compares it qualitatively with other forecasting methods. The results indicate that the soft dependency forecasting method based on text data can support the soft set forecasting task that is based on text data and does not consider the time lag effect.

Thirdly, the paper examines the soft sequence dependency forecasting method based on text data. The method uses the soft sequence probability, the soft sequence conditional probability, and the soft sequence dependency, and it addresses the shortcoming of the soft dependency forecasting method based on text data, which cannot handle the time lag effect. Because the soft sequence dependency extends the soft dependency to sequences, it has the same advantages in dealing with the natural language characteristic and the inexact characteristic. The paper defines the soft sequence estimation and the soft sequence dependency in terms of the soft sequence probability and the soft sequence conditional probability, introduces the problem to be solved, and constructs the soft sequence dependency forecasting model based on text data and the sequence dependency soft set model. These models are used to build the forecasting method. The soft sequence dependency forecasting model establishes the relationship between the soft sequence dependency and the soft set forecasting problem that is based on text data and considers the time lag effect. Its implementation relies on the feature soft set model and the sequence dependency soft set model. The feature soft set model transforms text data into a vector space model. The sequence dependency soft set model calculates the soft sequence estimation and completes the forecast. To handle the empty sets produced by the sequence dependency soft set model and the superabundant features encountered during forecasting, the paper again introduces methods for searching for an approximate set and for applying a heuristic algorithm. To evaluate the soft sequence estimation, the paper defines three soft sequence estimation error metrics: the sequence error soft mapping, the sequence single error soft mapping, and the sequence total error. In the example analysis section, the paper presents the implementation of the soft sequence dependency forecasting method. In the application analysis section, the method is used to predict the share price volatility at (t-1) of ten companies from 8-K reports. The results indicate that the soft sequence dependency forecasting method based on text data can support the soft set forecasting task that is based on text data and considers the time lag effect.
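
As an illustration of two ideas the abstract relies on, the following Python sketch shows (a) a soft set as a parameterized family of sets, here mapping each word to the set of documents that contain it, and (b) two point-to-set distances of the kind the error metrics are built on. It is a minimal sketch under stated assumptions, not the dissertation's method: the function names, the toy documents, the use of the Manhattan metric inside the Hausdorff distance, and the choice of words as parameters are all hypothetical, and the Theil inequality coefficients themselves are not reproduced because the abstract does not give their formulas.

```python
from typing import Dict, List, Set, Tuple

# A soft set (F, E) over a universe U maps each parameter e in E
# to a subset F(e) of U. Here U is a set of document ids and the
# parameters are words (an illustrative assumption).
SoftSet = Dict[str, Set[str]]
Point = Tuple[float, ...]


def build_soft_set(docs: Dict[str, str]) -> SoftSet:
    """Map each word to the set of ids of the documents containing it."""
    soft: SoftSet = {}
    for doc_id, text in docs.items():
        for word in set(text.lower().split()):
            soft.setdefault(word, set()).add(doc_id)
    return soft


def approximate_set(soft: SoftSet, params: List[str]) -> Set[str]:
    """Documents described by every given parameter (intersection of F(e))."""
    sets = [soft.get(p, set()) for p in params]
    return set.intersection(*sets) if sets else set()


def manhattan(a: Point, b: Point) -> float:
    """Manhattan (L1) distance between two points of equal dimension."""
    return sum(abs(x - y) for x, y in zip(a, b))


def hausdorff_point_to_set(x: Point, values: Set[Point]) -> float:
    """Hausdorff distance between {x} and a finite non-empty set.

    For a singleton this reduces to the largest distance from x
    to any element of the set.
    """
    return max(manhattan(x, v) for v in values)


def min_manhattan_point_to_set(x: Point, values: Set[Point]) -> float:
    """Smallest Manhattan distance from x to any element of the set."""
    return min(manhattan(x, v) for v in values)


# Hypothetical toy data, for illustration only.
docs = {
    "d1": "profit rises sharply",
    "d2": "profit falls",
    "d3": "loss rises",
}
soft = build_soft_set(docs)
matching = approximate_set(soft, ["profit", "rises"])
```

An empty result from `approximate_set` corresponds to the empty-set problem mentioned in the abstract, which the dissertation handles by searching for an approximate set; that search is not sketched here.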
Keywords/Search Tags: soft set, soft dependency, soft sequence dependency, text data, forecasting