Supported by key projects of the National Natural Science Foundation of China, and set against the practical background of a dedicated project for the Huaibei Coalmine Group Corp, this dissertation uses data mining methods to extract features from gas time series, recognize their patterns, and thereby provide early warning of gas accidents. The work has both theoretical and practical significance for improving the currently grim state of production safety in China's coal mines.

The main research topics are: the acquisition and preprocessing of coal mine gas time series; testing gas time series for stationarity, linearity and Gaussianity; dimension reduction and feature extraction for gas time series based on KPCA/KICA; and SVM-based pattern recognition of gas time series and its application to early warning.

Drawing on the development of the gas early-warning system at the Huaibei Coalmine Group Corp, the dissertation describes the system's structure. It proposes a data coding method that gives priority to specific gas data and expresses it as a C data structure. Using the ideas of predictive coding and run-length coding, it presents an effective data compression method for the gas time series of the safety monitoring system, and proves that the compressed series can express the information of the original series. Some typical gas time series, and their waveforms after cleaning, are studied and discussed.
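The combination of predictive coding and run-length coding mentioned above can be illustrated with a minimal sketch. It is not the dissertation's actual scheme: it assumes readings are stored as integers (for example, hundredths of a percent of methane), uses the simplest possible predictor (the previous sample), and is lossless. The names `compress` and `decompress` are hypothetical.

```python
from itertools import groupby


def compress(samples):
    """Delta-encode integer readings, then run-length encode the residuals."""
    if not samples:
        return []
    # Predictive step (assumed predictor: previous sample). The first
    # residual is the first sample itself, i.e. a prediction of zero.
    residuals = [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]
    # Run-length step: collapse equal consecutive residuals into (value, count).
    return [(value, sum(1 for _ in run)) for value, run in groupby(residuals)]


def decompress(pairs):
    """Invert compress(): expand the runs and accumulate the residuals."""
    samples = []
    previous = 0
    for value, count in pairs:
        for _ in range(count):
            previous += value
            samples.append(previous)
    return samples


# Hypothetical readings in hundredths of a percent; a mostly flat signal
# produces long runs of zero residuals, which is where the saving comes from.
readings = [35, 35, 35, 35, 36, 36, 36, 35, 35, 35]
packed = compress(readings)
restored = decompress(packed)
```

Because a gas concentration under normal conditions changes slowly, most residuals are zero and collapse into a few runs; a rapidly changing series would compress far less well under this simple predictor.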
Based on queuing theory, two models of gas data processing are proposed, one with priority and one without. Theoretical analysis and Matlab simulation agree that the average occupation time for gas data processing under the queuing model with priority is nearly 1/30 of that under the model without priority.

The dissertation introduces unit root tests, an active research topic in econometrics, into stationarity testing of gas time series. The ADF (Augmented Dickey-Fuller) and PP (Phillips-Perron) tests are each applied to the gas time series. Both tests show that gas time series are stationary only under normal gas conditions and are non-stationary in situations such as outburst, ventilation stoppage, coal cutting/blasting, and gas sensor calibration. The Hinich test, which uses the bispectrum and the bicoherence coefficient to test the linearity and Gaussianity of a time series simultaneously, is also applied. The results indicate that gas time series are non-linear and non-Gaussian in every situation: normal, outburst, ventilation stoppage, coal cutting/blasting, and gas sensor calibration.

The dissertation sets out the abstract framework of kernel methods and the relevant mathematical foundations. Based on set theory, metric space theory, operator theory, matrix theory and kernel methods, time series similarity is defined uniformly by the distance derived from the vector (matrix) norm, whether the series is univariate or multivariate, linear or non-linear. It is proved that this definition in the original space is equivalent to that in the transformed space, which theoretically extends and unifies the definition of time series similarity.
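One standard way to compute a norm-derived distance in a kernel-induced feature space, without ever forming the mapping explicitly, is the identity ‖φ(x) − φ(y)‖² = k(x, x) − 2k(x, y) + k(y, y). The sketch below illustrates that identity only; it is not claimed to be the dissertation's exact definition of similarity, and the function names and the `two_sigma_sq` parameter are assumptions made for illustration.

```python
import math


def linear_kernel(x, y):
    """Plain inner product: the feature space is the original space."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, two_sigma_sq=1.0):
    """Gaussian RBF kernel exp(-||x - y||^2 / (2 sigma^2))."""
    squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared / two_sigma_sq)


def kernel_distance(x, y, kernel):
    """Distance between phi(x) and phi(y), using kernel evaluations only."""
    squared = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(squared, 0.0))


# With the linear kernel the result is the ordinary Euclidean distance.
d_linear = kernel_distance([3.0, 4.0], [0.0, 0.0], linear_kernel)
# With the RBF kernel the same formula gives a distance in feature space.
d_rbf = kernel_distance([3.0, 4.0], [0.0, 0.0], rbf_kernel)
```

With the linear kernel the formula reduces to the Euclidean distance in the original space, which is the simplest case in which a single definition covers both the original and the transformed space.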
The theory and algorithms of KPCA/KICA-based dimension reduction and feature extraction for multivariate time series are studied, and Matlab simulations on synthetic data are given. The experiments show that KPCA and KICA each outperform their traditional counterparts, PCA and ICA. For the first time, the dissertation proposes extracting the features of gas time series from a multivariate time series (MTS) composed of the twenty-four statistical parameters used in testing gas time series for stationarity, linearity and Gaussianity. The results show that KPCA needs only two principal components to show clearly that the gas time series fall into five distinct classes, and that the features of these five classes are also expressed clearly in the three independent components separated by KICA. Together these results demonstrate the validity of KPCA/KICA for dimension reduction and feature extraction of gas time series.

Using One-Versus-All coding, One-Versus-One coding, Error-Correcting Output Codes and Minimum Output Coding (MOC) in turn, the dissertation builds multi-class classifiers from least squares support vector machine (LS-SVM) binary classifiers. These are first used to classify synthetic three-spiral data. The result shows that, by adjusting the parameters γ and 2σ² of the LS-SVM with a Gaussian RBF kernel (for example, γ = 10 and 2σ² = 0.01), the classification accuracy of the MOC classifier can reach 100%. They are then used to classify gas time series on the basis of the features extracted by KPCA and KICA. The result shows that, with suitable parameters (for example, γ = 1 and 2σ² = 0.3125), the classification accuracy of the MOC classifier can also reach 100%.

The application of these research results in the gas early-warning system of the Huaibei Coalmine Group Corp is also presented.

The dissertation has 53 figures, 16 tables and 218 references.
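As an illustration of the Minimum Output Coding idea mentioned above, the following minimal sketch shows how five classes can be represented by only three binary classifiers. It covers the coding and decoding step only, not LS-SVM training, and it is an assumption-laden sketch rather than the dissertation's implementation: the names `moc_codebook` and `moc_decode` are hypothetical, and codewords are simply the binary digits of the class index.

```python
def moc_codebook(n_classes):
    """Give each class a distinct +1/-1 codeword of minimal length."""
    # Smallest number of bits that can label n_classes distinct classes.
    n_bits = max(1, (n_classes - 1).bit_length())
    return [
        [1 if (label >> bit) & 1 else -1 for bit in range(n_bits)]
        for label in range(n_classes)
    ]


def moc_decode(outputs, codebook):
    """Return the class whose codeword is nearest in Hamming distance.

    `outputs` holds one real-valued score per binary classifier; only the
    sign of each score is used. Ties go to the lowest class index.
    """
    def hamming(code):
        return sum(1 for o, c in zip(outputs, code) if (o >= 0) != (c > 0))

    return min(range(len(codebook)), key=lambda label: hamming(codebook[label]))


# Five classes, as in the five gas situations, need only three binary
# classifiers; classifier k would be trained on bit k of every codeword.
codebook = moc_codebook(5)
# Hypothetical scores from the three binary classifiers for one sample.
predicted = moc_decode([0.8, 0.9, -0.3], codebook)
```

The trade-off of this coding is that it uses the fewest binary classifiers but leaves no redundancy, so a single wrong bit changes the decoded class, whereas error-correcting output codes spend extra classifiers to tolerate such errors.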