Structural monitoring technology is an important part of the construction industry. In a monitoring system, the sensor network passes the structural response data to the gathering sensor node, which connects to the Internet and forwards the data to the base station. Normal operation of the sensor network ensures that the structural response data are transferred smoothly, but in practice the sensor network often falls into an abnormal state for reasons such as sensor node failure, intrusion into the network system, or an unstable network topology. It is therefore very important to develop a suitable anomaly detection system for sensor networks.
At present, anomaly detection systems still have many defects. For example, data that deviate only slightly from the normal track are often treated as anomalous, which leads to a high false alarm rate, and there is no unified scientific basis for selecting the threshold used to define abnormal data. The purpose of this study is to put forward an effective anomaly detection algorithm for sensor networks that achieves a high detection rate and a low false positive rate.
On the basis of an analysis of the support vector machine (SVM) algorithm, this paper points out a defect of the traditional two-class SVM: its computational complexity is high when processing massive data sets. By combining the K-means clustering algorithm, as optimized in this paper, with the SVM, a new anomaly detection algorithm based on a block support vector machine is put forward. Before training, bidirectional feature selection is applied to the data to remove some of the redundant attributes.
The traditional K-means clustering algorithm has three defects: it is sensitive to the choice of initial centers, a randomly selected K value gives fluctuating results, and applying the traditional Euclidean distance formula to high-dimensional data leads to imprecise results. To address them, three improvements are put forward: selecting the initial centers based on density, using the K-R curve to choose the K value, and adding attribute weights to the Euclidean distance formula. The algorithm was implemented in Matlab, and the KDD CUP 99 data set was used as test data to validate its applicability. The experimental results showed that the improved algorithm obtained a higher detection rate and a lower false alarm rate than the traditional support vector machine, and that it could reduce the testing time on large-scale data. It thus provides an effective method for anomaly detection in sensor networks and for processing large-scale data.
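As a rough illustration of two of the K-means improvements named above, the following Python sketch shows an attribute-weighted Euclidean distance and a density-based choice of initial centers. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the weighted form sqrt(sum_j w_j (x_j - y_j)^2), the definition of density as the number of neighbors within a fixed radius, and the names `weighted_distance`, `density_based_centers`, and `radius` are all hypothetical and are not taken from the paper or its Matlab code.

```python
import math


def weighted_distance(x, y, weights):
    """Attribute-weighted Euclidean distance (assumed form, not the paper's)."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def density_based_centers(points, k, radius, weights):
    """Pick up to k initial centers from dense regions (illustrative sketch).

    Density of a point is assumed to be the number of other points that lie
    within `radius` of it under the weighted distance.
    """
    n = len(points)
    density = [
        sum(
            1
            for j in range(n)
            if j != i
            and weighted_distance(points[i], points[j], weights) <= radius
        )
        for i in range(n)
    ]
    # Visit points from densest to sparsest; sorted() is stable, so ties
    # keep their original order.
    order = sorted(range(n), key=lambda i: -density[i])
    centers = []
    for i in order:
        # Accept a point only if it is farther than `radius` from every
        # center chosen so far, so centers come from different regions.
        if all(
            weighted_distance(points[i], c, weights) > radius for c in centers
        ):
            centers.append(points[i])
        if len(centers) == k:
            break
    return centers


if __name__ == "__main__":
    data = [
        (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),   # first dense group
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),   # second dense group
        (10.0, 0.0),                          # isolated point
    ]
    print(density_based_centers(data, k=2, radius=0.5, weights=[1.0, 1.0]))
```

In this sketch the isolated point has zero density and is visited last, so it is not chosen as an initial center when k=2, which is the kind of sensitivity to poor random initialization that a density-based choice is meant to reduce. The resulting centers would then seed an ordinary K-means loop that uses the same weighted distance.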