| Multivariate time series prediction is a widely studied research area with applications in domains such as stock price prediction, weather forecasting, and traffic flow prediction. Time series prediction methods have evolved through three stages: classical statistical methods that rely on assumptions of data stationarity and normality, traditional machine learning techniques that require manual feature engineering, and deep learning methods that are driven by large amounts of data. Deep learning uses neural networks to automatically capture nonlinear and dynamic relationships in time series data, which brings advantages in prediction accuracy, inference efficiency, and scalability. However, deep learning models often suffer from large model size and complex structure. In this study, we propose a Multivariate Temporal Convolutional Attention Network (MTCAN), based on self-attention and Temporal Convolutional Networks (TCN), to simplify the model structure for multivariate time series prediction, and we introduce pruning to reduce the model size. The specific research is as follows: (1) Establishing the MTCAN model. A primary issue faced by RNN-based deep learning models is gradient explosion or vanishing, and introducing additional structures to solve this problem increases model complexity. We therefore build the MTCAN model on CNNs, which do not suffer from gradient explosion or vanishing. To address the problem that convolutional neural networks are limited by the kernel size and cannot capture long-term dependencies, one-dimensional dilated convolutions are used to construct residual blocks that flexibly expand the receptive field. To reduce the complexity of multivariate time series data, the data is split into multiple univariate time series, and features are extracted from each univariate series separately. A self-attention mechanism is used to enhance the capture of self-correlation in the time series data and to reduce the number of model parameters. Finally, the outputs for the univariate time series are merged, and the final prediction is obtained through a fully connected layer. Experimental results show that the prediction accuracy and generalization ability of this model are better than those of the LSTM, GRU, ConvLSTM, and TCN models. Ablation experiments show that the self-attention layer reduces the number of parameters by about 75% and also improves the accuracy and generalization of the predictions to a certain extent. (2) Based on the MTCAN model, pruning is applied to further optimize the model size. Although deep learning models are widely used in various fields, their deployment is often limited by the available hardware resources, and they cannot be deployed on many embedded or edge devices. To further optimize the MTCAN model, weight pruning is used to reduce the model size on top of the self-attention mechanism. Weight pruning experiments on the MTCAN model show that compression can significantly reduce its size without affecting its performance. |
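
Two mechanisms named in the abstract, dilated causal convolution and weight pruning, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The function names, the kernel size of 3, the dilation schedule 1, 2, 4, 8, and the use of magnitude-based pruning are all assumptions made for illustration; the abstract does not specify these details.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in time steps, of stacked causal dilated convolutions.

    Assumes one convolution per entry in `dilations` (hypothetical layout).
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def causal_dilated_conv1d(x, weights, dilation):
    """Causal 1-D dilated convolution with implicit left zero-padding.

    y[t] = sum_k weights[k] * x[t - k * dilation], where x[i] = 0 for i < 0,
    so the output at time t never depends on future inputs.
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            i = t - k * dilation
            if i >= 0:
                acc += w * x[i]
        y.append(acc)
    return y


def magnitude_prune(weights, sparsity):
    """Zero the fraction `sparsity` of weights with the smallest magnitude.

    Magnitude pruning is one common weight-pruning criterion; it is assumed
    here, not taken from the source.
    """
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


# Doubling the dilation grows the receptive field much faster than
# stacking the same number of undilated layers.
print(receptive_field(3, [1, 2, 4, 8]))  # 31
print(receptive_field(3, [1, 1, 1, 1]))  # 9

# Each output adds the current sample and the sample two steps earlier.
print(causal_dilated_conv1d([1, 2, 3, 4, 5], [1, 1], dilation=2))

# Half of the weights, the two smallest in magnitude, are set to zero.
print(magnitude_prune([0.5, -0.1, 0.05, -2.0], sparsity=0.5))
```

With the same four layers and kernel size, the dilated stack covers 31 time steps against 9 for the undilated one, which is the sense in which dilation "flexibly expands the receptive field" without adding parameters.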