Study On Text Data Based Auxiliary Maintenance Method For On-board Equipment Of Train Control System

Posted on: 2020-05-11  Degree: Master  Type: Thesis
Country: China  Candidate: S Wu  Full Text: PDF
GTID: 2392330578454917  Subject: Traffic Information Engineering & Control
Abstract/Summary:
On-board equipment is one of the core devices of the train control system in China, and its importance is self-evident. In actual operation, the failure frequency of on-board equipment remains high. Existing fault diagnosis relies on the experience of maintenance personnel, which is inefficient and slow; this directly reduces operating efficiency and can even endanger operating safety. The railway operation department therefore urgently needs auxiliary maintenance methods that improve the efficiency of fault diagnosis for on-board equipment.

The fault phenomenon description texts recorded on site (hereinafter "fault texts") contain a large amount of information related to the fault category. Representing fault texts and constructing a "fault text-fault category" classification system to assist maintenance personnel is of great significance for improving fault diagnosis efficiency. However, because fault texts are short, lack a unified record format, and differ greatly in number across fault categories, the traditional vector space model feature representation and traditional classification algorithms cannot be applied to the construction of a fault text classifier for on-board equipment. To solve these problems, this paper proposes a fault text feature extraction method that fuses word features and topic features. On this basis, a fault text classification system based on a cost-sensitive Support Vector Machine (SVM) is constructed. The specific work is as follows:

(1) Extracting word features of on-board equipment fault texts based on information gain. First, the vector space model is used to represent fault texts. Because this representation is high-dimensional and contains a large number of extraneous features, this paper proposes a word feature extraction method based on information gain. It selects from the vector space model representation the features related to the classification task, yielding the word feature representation of fault texts.

(2) To address the shortcomings of the vector space model in representing short texts, this paper proposes a topic feature extraction method based on a multi-granularity Latent Dirichlet Allocation (LDA) model, which expands the vector space model representation of fault texts by mapping text features from the word space to the topic space. The LDA model is sensitive to the number of topics, and that number is difficult to determine in practice. To obtain better topic features, this paper first uses the perplexity index to select LDA topic feature spaces, giving a set of topic feature spaces with different numbers of topics. A topic feature space fusion algorithm based on improved Relief (Relevant Features) is then proposed to fuse the topic features over this set, yielding the multi-granularity topic feature representation of fault texts.

(3) A serial feature fusion strategy is used to fuse the word features and the multi-granularity topic features, yielding the feature vector representation of fault texts.

(4) The distribution of fault text categories is imbalanced, which leads to poor classification of minority classes. To solve this problem, this paper proposes a method for constructing a fault text classification model based on cost-sensitive SVM. During SVM training, the misclassification cost of minority-class samples is increased and that of majority-class samples is reduced, so that the SVM is cost-sensitive across classes and the classification accuracy on minority-class samples improves.

Finally, the proposed feature extraction method and the cost-sensitive SVM-based classifier construction method are compared with traditional methods. Compared with the traditional vector space model feature representation, the proposed feature extraction method effectively compensates for the shortcomings of the vector space model in representing short texts and improves the accuracy of text classification. Compared with the traditional classifier model, the classifier based on cost-sensitive SVM effectively improves the classification accuracy on minority classes of fault texts. Experimental results show that the proposed fault text classification model can effectively assist maintenance personnel in the fault diagnosis of on-board equipment and improve its efficiency.
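
As an illustration of two ideas named in the abstract, the following minimal Python sketch scores words by information gain and derives per-class misclassification costs from class frequencies. It is not the thesis's implementation: the toy fault texts, the category labels ("BTM", "SDU"), the function names, and the inverse-frequency cost formula are all assumptions made for this example, since the abstract does not state how costs are set.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(term, docs, labels):
    """IG(term) = H(C) - [P(t) * H(C | t) + P(not t) * H(C | not t)]."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    p_t = len(with_term) / len(labels)
    conditional = p_t * entropy(with_term) + (1 - p_t) * entropy(without_term)
    return entropy(labels) - conditional


def inverse_frequency_costs(labels):
    """Assumed cost scheme: n_samples / (n_classes * class_count).

    Rare classes receive a higher misclassification cost than common ones.
    """
    counts = Counter(labels)
    return {c: len(labels) / (len(counts) * n) for c, n in counts.items()}


# Hypothetical tokenised fault texts and fault categories (illustrative only).
docs = [
    {"balise", "message", "lost"},
    {"balise", "read", "error"},
    {"balise", "timeout"},
    {"speed", "sensor", "error"},
]
labels = ["BTM", "BTM", "BTM", "SDU"]

# Rank the vocabulary by information gain; high-scoring words would be kept.
vocab = sorted(set().union(*docs))
ranked = sorted(vocab, key=lambda t: information_gain(t, docs, labels),
                reverse=True)

# Per-class costs; the minority class "SDU" gets the larger cost.
costs = inverse_frequency_costs(labels)
```

In this toy data, a word such as "balise" that separates the two categories perfectly scores higher than "error", which appears in both. A cost dictionary of this shape could, for example, be passed as the `class_weight` argument of scikit-learn's `SVC`, which scales the penalty parameter per class; whether the thesis uses that library is not stated.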
Keywords/Search Tags: On-board Equipment, Auxiliary Maintenance, Information Gain, Multi-granularity LDA, Cost-sensitive SVM