Study On Text Mining Based Fault Diagnosis Method For Vehicle On-board Equipment Of High Speed Railway

Posted on: 2017-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: F Wang
GTID: 2272330482979423
Subject: Traffic Information Engineering & Control

Abstract/Summary:
In the railway industry, the vehicle on-board equipment (VOBE) of high speed railway is one of the core components of the Chinese Train Control System and a key factor in the safety and efficiency of high speed railway. However, statistics from the Wuhan-Guangzhou high speed railway show that railway systems currently suffer from a high rate of VOBE failures. Research on fault diagnosis for VOBE is therefore important.

As high-speed railways operate, a vast amount of text data is recorded every year in the form of repair verbatim. These records are a useful source from which knowledge can be discovered for efficient fault diagnosis and for handling similar cases in the future. Text mining techniques can be applied to repair verbatim data to establish associations between fault terms and fault classes, and these associations can then be used to improve the precision of fault diagnosis.

However, automatically discovering knowledge from repair verbatim is a non-trivial task, mainly for the following reasons:

1) High-dimensional data. Maintenance documents contain tens of thousands or even hundreds of thousands of distinct terms or tokens. After stop-word elimination and stemming, the feature set is still too large for many learning algorithms. A text mining based fault diagnosis methodology depends heavily on the particular choice of features used by the classifier. Efficient feature selection methods are therefore key to improving the scalability of text categorization and the accuracy of fault diagnosis.

2) Imbalanced fault class distribution. In maintenance documents, the number of examples in one fault class (i.e., the majority class) is significantly greater than in the others (i.e., the minority classes). This is caused by the complexity of VOBE components and the diversity of their working environments. These imbalanced class distributions pose a serious difficulty for most learning algorithms.

3) Lack of semantic information. Traditional feature selection methods lose much of the semantic information because of the nature of the "bag-of-words" model.

To address the shortcomings described above, this thesis proposes a bi-level feature extraction method for VOBE fault diagnosis, on the basis of which an SVM-based classifier is employed to identify two-level fault patterns for VOBE. Experiments show that the proposed model can improve the performance and reliability of fault diagnosis, especially for minor fault patterns.

The innovations of the thesis are as follows:

1) An improved CHI method is proposed to extract term features. Based on an in-depth analysis of the imbalanced data set, the improved CHI method performs feature extraction at the term level: the weights of the exclusive features of each fault pattern are adjusted to balance their importance, and common features are reselected by considering the distribution distance between fault patterns.

2) A prior LDA model is proposed to extract semantic features. First, a novel method is proposed to extract prior knowledge, which is then integrated into the basic LDA model to extract semantic features. As a semi-supervised model, prior LDA can extract more meaningful semantic features than basic LDA, which is an unsupervised model.

3) A hierarchical classification method based on Support Vector Machine (SVM) is proposed, which can reduce the complexity of VOBE fault diagnosis and improve the performance of the classifier.
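
The term-level feature selection mentioned in the abstract builds on the CHI (chi-square) statistic. The following is a minimal sketch of the standard statistic only, not the improved CHI variant proposed in the thesis, whose weighting and reselection rules are not given in the abstract. The toy corpus, the class labels "minor" and "major", and the function names `chi_square` and `top_terms` are assumptions made for illustration.

```python
def chi_square(docs, labels, term, target):
    """Standard chi-square score of a term for one class (2x2 contingency table)."""
    n = len(docs)
    pairs = list(zip(docs, labels))
    a = sum(1 for d, y in pairs if term in d and y == target)      # term present, target class
    b = sum(1 for d, y in pairs if term in d and y != target)      # term present, other classes
    c = sum(1 for d, y in pairs if term not in d and y == target)  # term absent, target class
    d_ = n - a - b - c                                             # term absent, other classes
    denom = (a + c) * (b + d_) * (a + b) * (c + d_)
    if denom == 0:
        return 0.0  # term or class never varies, so no association can be measured
    return n * (a * d_ - b * c) ** 2 / denom


def top_terms(docs, labels, target, k):
    """Return the k terms with the highest chi-square score for the target class."""
    vocab = set().union(*docs)
    # Sort by descending score, then alphabetically to break ties deterministically.
    ranked = sorted(vocab, key=lambda t: (-chi_square(docs, labels, t, target), t))
    return ranked[:k]


# Hypothetical toy corpus: each record is a set of tokens from one repair note.
docs = [
    {"balise", "antenna", "fault"},
    {"balise", "signal", "lost"},
    {"speed", "sensor", "fault"},
    {"speed", "sensor", "drift"},
    {"speed", "display", "fault"},
    {"speed", "sensor", "noise"},
]
# Hypothetical imbalanced labels: two records in one class, four in the other.
labels = ["minor", "minor", "major", "major", "major", "major"]

for term in top_terms(docs, labels, "minor", 3):
    print(term, chi_square(docs, labels, term, "minor"))
```

Note that the standard statistic scores a term highly whether it is positively or negatively associated with the class, and a term spread evenly across classes (such as "fault" here) scores zero. This is the kind of behaviour that reweighting schemes for imbalanced data aim to adjust.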
Keywords/Search Tags:High Speed Railway, Vehicle On-board Equipment, Fault Diagnosis, Improved CHI, Prior LDA, Feature Fusion, SVM