
Research On The Data Cleansing Methods For Bridge Monitoring Data Based On Big-Data Platform

Posted on: 2020-12-28
Degree: Master
Type: Thesis
Country: China
Candidate: P. Ren
Full Text: PDF
GTID: 2392330620456249
Subject: Civil engineering

Abstract/Summary:
With the construction of major bridge infrastructure in China, the installation technology and research concerning bridge health monitoring systems have matured considerably. However, previous research mainly focused on damage identification and early warning for bridge structures; there has been little investigation into anomalies in monitoring data and the corresponding data cleansing methods, which are essential for the further evaluation of the service performance of bridge structures. In this dissertation, the effectiveness and feasibility of the proposed methods are compared and analyzed on different types of data collected from a high-speed railway bridge (Nanjing Dashengguan Bridge) and a medium-span highway bridge (Lieshihe Bridge). In addition, to ensure real-time data cleansing, data analysis, and early warning, a big-data platform is established in this dissertation. Second-level data processing is achieved by exploiting the highly efficient parallel computation and excellent fault tolerance of a distributed file system. The main research contents are as follows.

(1) A novel bridge health monitoring system based on big-data technology is established, which excels in reliability, availability, storage efficiency, and scalability. After comparing the advantages and disadvantages of various big-data technologies and methods, HDFS (Hadoop Distributed File System) is selected for data storage and Spark for data modeling when processing offline data; Kafka is chosen as the data cache and Spark Streaming for data reading and processing when dealing with real-time data. Data experiments show that the established big-data platform is superior in offline computation performance, real-time online performance, scalability, and fault tolerance.

(2) Optimal data cleansing methods are proposed for data noise, outliers, and data drift. Taking different
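The real-time path described above (Kafka as the data cache, Spark Streaming for reading and processing) can be sketched as a minimal wiring fragment. This is an illustrative configuration only, not code from the dissertation; the broker address, topic name, and one-second trigger are placeholder assumptions.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("bridge-shm-streaming").getOrCreate()

# Read raw sensor messages cached in Kafka (broker and topic are placeholders).
raw = (spark.readStream.format("kafka")
       .option("kafka.bootstrap.servers", "kafka:9092")
       .option("subscribe", "sensor-readings")
       .load())

# Kafka delivers the payload as bytes; cast it to string for downstream parsing.
readings = raw.select(F.col("value").cast("string").alias("payload"))

# Process micro-batches at a one-second trigger to keep end-to-end latency
# at the second level; cleansing and early-warning logic would slot in here.
query = (readings.writeStream
         .outputMode("append")
         .format("console")
         .trigger(processingTime="1 second")
         .start())
```

The micro-batch trigger interval is the knob that trades throughput against the second-level latency target stated above.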
types of data into consideration, stable signal data (e.g., temperature data) and unstable signal data (e.g., dynamic strain data) are investigated separately. First, only stable signal data such as temperature need denoising; unstable signal data do not. For stable signal data, the Super Smoother algorithm denoises stably but tends to over-cleanse the details of the original data. The Moving Average Filter is suitable only for simple noise and does not adapt to complicated noise. The Wavelet Decomposition Method performs best for denoising, because the number of decomposition layers can be set flexibly for different sensor data, which makes the denoised data closest to the original. Second, outliers must be removed from both stable and unstable signals. For temperature data, the Interval Estimation Method can identify only simple outliers; the improved Interval Estimation Method performs better but tends to flag many false outliers. The Generalized Three-Sigma Rule combined with the Super Smoother algorithm identifies outliers accurately and is best suited to removing outliers in stable signals. The Generalized Three-Sigma Rule combined with the Wavelet Method also identifies outliers well but, owing to its stricter thresholds, is preferable for unstable signals. The Hampel filter, with its fixed window length, cannot identify all kinds of outliers. For unstable signal data, a method is proposed that combines the correlation between sensor data with the Generalized Three-Sigma Rule based on the Wavelet Method, and it yields good cleansing results. Third, two methods are proposed to handle data drift in stable signal data, based on data correlation and on data difference, respectively; unstable data need no drift preprocessing because drift is rare there. For stable signals, the two methods are combined to cleanse data drift and have
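The core idea behind the smoother-plus-three-sigma outlier rule can be sketched as follows: smooth the series, take residuals against the trend, and flag points whose residual exceeds three standard deviations. This is a simplified illustration using a plain moving-average smoother in place of the Super Smoother, not the dissertation's exact algorithm.

```python
import numpy as np

def moving_average(x, window=5):
    """Centred moving-average smoother; edges are padded by reflection."""
    pad = window // 2
    xp = np.pad(x, pad, mode="reflect")
    kernel = np.ones(window) / window
    return np.convolve(xp, kernel, mode="valid")

def three_sigma_outliers(x, window=5, k=3.0):
    """Flag points whose residual from the smoothed trend exceeds k times
    the standard deviation of the residuals (the classical 3-sigma idea)."""
    resid = x - moving_average(x, window)
    return np.abs(resid) > k * resid.std()

# Example: a slow temperature-like trend with one injected spike.
rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 6, 200)) + 0.01 * rng.standard_normal(200)
x[50] += 5.0
mask = three_sigma_outliers(x)   # mask[50] is True
```

A fixed window, as the abstract notes for the Hampel filter, is the weak point of such schemes; the wavelet-based variant replaces the smoother with a reconstruction from selected decomposition layers.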
a good cleansing effect.

(3) Complete and universal data recovery methods are proposed for different types of data. The target of data recovery is over 90% accuracy, and heavily missing data are not recovered. For stable signal data, when the data loss ratio is below 15%, the one-dimensional time-history interpolation method is preferred; when the loss ratio is between 15% and 20%, the one-dimensional time-history fitting method is preferred; when the loss ratio exceeds 20%, the data should not be used. For unstable signals, a data recovery method based on the correlation between different sensor data is proposed. If the data loss ratio is no more than 20%, the recovered result is acceptable for engineering practice; if it exceeds 20%, the data should no longer be used.
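The loss-ratio-gated recovery rule for stable signals can be sketched as below. The 15% threshold follows the interpolation criterion stated above; the function itself (name, NaN-based gap encoding, linear interpolation) is an illustrative assumption, not the dissertation's implementation.

```python
import numpy as np

def recover_gaps(t, x, max_loss_ratio=0.15):
    """Fill NaN gaps by one-dimensional time-history interpolation,
    but only when the loss ratio stays below the given threshold
    (15% is the interpolation criterion for stable signals)."""
    missing = np.isnan(x)
    loss_ratio = missing.mean()
    if loss_ratio > max_loss_ratio:
        raise ValueError(f"loss ratio {loss_ratio:.0%} exceeds threshold")
    filled = x.copy()
    filled[missing] = np.interp(t[missing], t[~missing], x[~missing])
    return filled

# Example: one missing sample out of ten (10% loss) is recovered;
# three missing samples (30% loss) are rejected.
t = np.arange(10.0)
x = 2.0 * t + 1.0
x[3] = np.nan
recovered = recover_gaps(t, x)
```

For the 15-20% band the abstract prescribes time-history fitting (e.g. a low-order polynomial fit) instead of interpolation; the gate-then-recover structure stays the same.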
Keywords/Search Tags: data cleansing, bridge health monitoring, big-data platform, data recovery