| With the acceleration of industrialization, air pollution has become a global problem that poses a great threat to human health, so monitoring and predicting air pollution are particularly important. In recent years, many countries have established air quality monitoring stations (standard stations) to monitor air pollutants in real time. These stations provide accurate air pollutant concentration data, but they are costly to deploy, which leads to their sparse spatial distribution. With the development of sensor technology, a large number of low-cost, portable micro air monitoring sensor devices (micro stations) have appeared, and these low-cost sensors can be used in wireless sensor networks for air quality monitoring. Although low-cost sensors provide a more convenient way to monitor air pollutant concentrations, the concentrations they report are often inaccurate. Notably, because of the limitations of the sensing principle, such a sensor cannot output a concentration value directly; it outputs an electrical signal response instead. In theory, the sensor response has a linear relationship with the concentration. In practice, however, affected by complex factors, the relationship between the sensor responses and the reference concentrations is complex and nonlinear. This type of monitoring equipment also faces other problems, such as cross interference and sensor aging. To achieve more reliable and effective air pollution monitoring, our goal is to calibrate low-cost air monitoring sensors. By monitoring air quality, people can understand the real-time pollution status and pollutant trends of the monitored area, which benefits air quality-related policymaking. In particular, it is necessary to monitor areas that may produce high pollution emissions, which helps the government control emissions in such areas in a timely manner. However, real-time monitoring is 
not significantly helpful for people’s travel planning. Therefore, in recent years, the air pollution prediction task has also attracted much attention. To better accomplish the monitoring and prediction tasks for air pollution, this work investigates data mining solutions to these tasks based on knowledge from the air quality field. The main contributions are listed below. 1. To better achieve air pollution monitoring, a calibration method based on sequence modeling (Deep CM) is proposed for air monitoring sensors. This work innovatively reformulates the point-to-point sensor calibration problem as a sequence-to-point calibration problem. This reformulation not only alleviates the cross interference phenomenon but also overcomes the inability of existing calibration methods to mine relevant knowledge from historical time series. The experimental results show that this method outperforms the seven baseline methods. 2. To alleviate the sensor drift phenomenon encountered in sensor calibration, a dual encoder network based on the information captured from micro and standard stations is proposed. The network also introduces a social-based guidance mechanism, which can dynamically introduce both periodic features and adjacent similar features. To evaluate the performance of the proposed method, it is compared with nine baseline methods on two real datasets. Experimental results show that this method alleviates the drift phenomenon encountered in Deep CM and achieves the best performance. 3. To reduce the computational resources consumed by calibration models, a calibration method for air monitoring sensors based on multi-task learning is proposed. The method uses multi-task learning to model the interaction between different tasks and calibrates multiple sensors while improving performance. The method is evaluated on three real-world datasets and compared with multiple baseline methods. Experimental results show that the 
proposed multi-task calibration method achieves the best overall performance. 4. To better achieve the air pollutant prediction task, an air pollutant concentration prediction method guided by level information is proposed. On the one hand, level information is more stable than concentration information, because it is coarse-grained and not easily affected by noise. On the other hand, level information is derived by experts through domain knowledge analysis, so introducing it brings in latent domain knowledge. The method is applied in both a time series mining setting and a spatio-temporal data mining setting. Experimental results show that the proposed method achieves the best performance in both settings. |