Forecast For The Train Ticket Data Analysis And Feature Extraction Methods

Posted on: 2005-11-24
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Lv
Full Text: PDF
GTID: 2208360125457901
Subject: Computer software and theory
Abstract/Summary:
With the development of information technology in China's railways, a large volume of ticket data has been collected in the China Railway Train Ticket System (CRTTS), a subsystem of the China Railway information system. Extracting valuable decision information from this mass of ticket data efficiently, at low human and technical cost, has become an urgent requirement for the railway's decision-making departments and a key concern for its information departments. The rapid development of data mining techniques provides a solid theoretical foundation for further research on railway ticketing analysis, but current data mining methods have limitations when applied to the huge datasets found in the railway setting. The generic methods must therefore be improved to fit the needs of the application.

Taking railway passenger traffic as the study background and focusing on the requirements of train ticketing analysis, we investigate in depth, with extensive application experiments, how to build efficient data analysis models on the ticket dataset in CRTTS. Two data mining methods, decision tree induction and concept description, are the theoretical starting point of the study, and the research aims to build rational and efficient models for analyzing train ticket datasets.

Firstly, after a detailed analysis of current classification algorithms, in particular ID3, SLIQ, and SPRINT, and in light of the requirements of decision analysis and the limitations of current prediction methods in CRTTS, a new method based on decision tree induction, TTDTPA, is presented. TTDTPA overcomes the memory restriction, can extract instructive rules that combine the advantages of prediction and statistics, and is easy to implement as a parallel algorithm. It is therefore suitable for supporting the multi-level predictive analysis requirements of decision-makers in CRTTS.

Secondly, to make the analysis more comprehensive, this research also applies two other data analysis methods to the train ticket data: naive Bayes, and a new method based on the indiscernibility relation. Application experiments show that the latter method is effective at extracting the data characteristics of minority kinds within a main class, which makes up for TTDTPA's limitation in this respect. Finally, based on the inductive analysis of these methods and considering the application background, a guiding method for building analysis models on train ticket data is given at the end of the thesis.

This study is an effective exploration of the application of data mining techniques and lays the groundwork for further research on data analysis in CRTTS. The improved methods can build an efficient predictive model that helps decision-makers understand the railway transportation situation and obtain multi-aspect, multi-level analyses of train ticket data.
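
For readers unfamiliar with the indiscernibility relation mentioned above, the following is a minimal sketch of the general rough-set idea, not the method developed in the thesis: two records are indiscernible with respect to a set of attributes when they agree on every one of them, and the relation partitions the dataset into equivalence classes. The function name, the attribute names, and the ticket records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def indiscernibility_classes(records, attributes):
    """Group record indices into classes that agree on every listed attribute."""
    classes = defaultdict(list)
    for index, record in enumerate(records):
        # Records with the same key cannot be told apart using `attributes`.
        key = tuple(record[name] for name in attributes)
        classes[key].append(index)
    return dict(classes)


# Hypothetical ticket records; attribute names and values are assumptions.
tickets = [
    {"seat": "hard", "distance": "short", "season": "peak"},
    {"seat": "hard", "distance": "short", "season": "peak"},
    {"seat": "soft", "distance": "long", "season": "peak"},
    {"seat": "hard", "distance": "long", "season": "off"},
    {"seat": "hard", "distance": "short", "season": "off"},
]

classes = indiscernibility_classes(tickets, ["seat", "distance"])
for key, members in classes.items():
    print(key, members)
```

Small equivalence classes, such as the single soft-seat, long-distance record here, are one simple way to surface minority groups that a majority-driven model might overlook.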
Keywords/Search Tags:data mining, descriptive mining tasks, predictive mining tasks, decision tree induction, rough set, train ticket analysis, train traffic