With the widespread use of escalators in public areas, escalator accidents are occurring more frequently. Most of these accidents are caused by abnormal passenger behavior while riding. Monitoring the status of escalators and their passengers is therefore the main means of avoiding accidents. However, the high cost and low efficiency of manual guarding make large-scale deployment impractical. Recently, the rapid development of artificial intelligence and embedded technology has made it increasingly feasible to apply intelligent video monitoring systems to escalator safety management. However, escalator scenes remain variable and complex, which requires efficient software algorithms and robust hardware platforms to keep all functions running smoothly and to deliver alert messages in time. Existing intelligent video surveillance systems are limited in both hardware and software performance and are therefore not suitable for direct application to escalator scenes. Based on an in-depth study and analysis of the key technologies in existing intelligent video surveillance systems, this paper proposes a series of improved algorithms. The main contributions of this work are as follows:

(1) To address the high training-time cost of existing Adaboost-based detection algorithms, this paper proposes a detection algorithm based on fast-training Adaboost that can be deployed rapidly in different complex scenes. Based on the characteristics of the sample weight distribution during Adaboost training, an adaptive weight trimming rule is designed to select the samples that have an important influence on classifier generation. This reduces the number of samples involved in training and significantly accelerates the training process while preserving the accuracy of the algorithm. In addition, this paper constructs a pedestrian detection dataset (POE2018) to support more
accurate pedestrian detection in escalator scenes.

(2) To overcome the performance degradation of existing multi-target pedestrian tracking algorithms in complex scenes, this paper proposes a multi-target tracking algorithm based on a multi-stream attention Siamese network (MT-MSASiam). A super-resolution module is constructed to raise the resolution of the target template and thereby counter the weak features caused by low video resolution. In addition, a data augmentation module is constructed to increase the feature diversity of the target templates. Three backbone networks then extract features from the original, super-resolution, and data-augmented target templates respectively, and the extracted features are fused to strengthen the feature representation. Channel attention and spatial attention modules are also applied to the backbone networks to further enhance feature extraction. Next, a region proposal network (RPN) obtains the scale and position of the tracked target in subsequent video frames from the fused feature map and the feature map of the search region. Finally, a multi-target confidence matching rule is designed to solve the multi-target matching problem across video frame sequences, which allows the algorithm to perform multi-target tracking stably and efficiently in a variety of complex tracking scenes.

(3) To resolve the insufficient generalization of the adjacency matrix and the inefficiency of the attention module in existing graph convolutional behavior recognition algorithms, this paper proposes a two-stream adaptive spatial-temporal attention graph convolution network (2S-ASTAGCN). An adaptive topology graph is designed whose connections are optimized adaptively during training. Adaptive connection parameters are
utilized to balance the adaptive topology graph against the original topology graph of the graph data, which improves the generalizability of the model while preserving recognition performance. In addition, a data-based spatial and temporal attention module is designed to focus the model on the nodes and frames that are most valuable for recognition. Finally, skeleton information is generated from the joint information in the skeleton graph data, and a two-stream framework is designed to combine the recognition results obtained from the joint information and the skeleton information, which further improves the recognition performance of the overall model.

(4) To cope with the problem that existing image data augmentation algorithms cannot be applied to non-Euclidean graph-structured data, this paper proposes graph data augmentation based on adaptive graph convolution (AGCN-GDA). An adaptive graph convolutional network model is used for graph data recognition, and the adaptive adjacency matrix is then used to augment the graph data, which improves the performance and generalizability of the graph data augmentation method. To further address the recognition of abnormal passenger behavior on escalators, this paper collects a large number of videos of abnormal behavior in escalator scenes. Human joint information is then extracted to generate skeleton graph data, establishing the skeleton graph of human action on escalator dataset (SAE2020). Finally, AGCN-GDA is applied to obtain an augmented version of this dataset (SAE2020-A).

(5) To address the insufficient use of spatial-temporal information in existing graph convolution behavior recognition algorithms, this paper proposes a multi-stream adaptive spatial-temporal attention 3D graph convolution network (MS-ASTAGCN-3D). An adaptive high-order adjacency matrix is designed to optimally aggregate multi-scale
neighborhood features by adaptively optimizing the weights of the different orders in the high-order adjacency matrix during training. A multi-scale attention module is then designed to further improve performance and efficiency by aggregating local and global information. Finally, the corresponding higher-order information is generated from the joint information and the skeleton information, and a novel multi-stream framework is designed to fuse the recognition results of the multimodal data streams, which makes full use of their complementary effects and improves recognition performance.

(6) To address the limitations of existing intelligent video surveillance systems, this paper integrates and ports the proposed algorithms and develops an intelligent video surveillance system for escalators on the NVIDIA Jetson Xavier platform. The system performs pedestrian detection, pedestrian tracking, and abnormal pedestrian behavior recognition on escalators, and achieves excellent overall performance in complex escalator scenes.
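
As an illustration of the weight trimming idea in contribution (1), the following is a minimal sketch only. The abstract does not specify the adaptive trimming rule, so the code substitutes the classic fixed-threshold form of weight trimming: in each boosting round, the weak learner is fitted only on the heaviest samples that together carry a chosen fraction of the total weight. The names `trim_by_weight`, `keep_mass`, `fit_stump`, and `adaboost_trimmed`, the 1-D threshold stumps, and the toy data are all assumptions made for this example and are not taken from the paper.

```python
import math

def trim_by_weight(weights, keep_mass=0.9):
    """Indices of the heaviest samples that together hold at least
    `keep_mass` of the total weight (fixed-threshold weight trimming;
    the paper's adaptive rule is not specified and is NOT reproduced here)."""
    total = sum(weights)
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += weights[i]
        if mass >= keep_mass * total:
            break
    return sorted(kept)

def stump_predict(x, threshold, sign):
    """Threshold stump on a 1-D feature: +sign at or above the threshold."""
    return sign if x >= threshold else -sign

def fit_stump(xs, ys, weights, idx):
    """Lowest weighted-error stump, searched only over the samples in idx."""
    best = None
    for t in sorted({xs[i] for i in idx}):
        for sign in (1, -1):
            err = sum(weights[i] for i in idx
                      if stump_predict(xs[i], t, sign) != ys[i])
            if best is None or err < best[0]:
                best = (err, t, sign)
    return best[1], best[2]

def adaboost_trimmed(xs, ys, rounds=5, keep_mass=0.9):
    """Discrete AdaBoost where each weak learner sees only the trimmed set."""
    n = len(xs)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        idx = trim_by_weight(w, keep_mass)        # fewer samples to scan
        t, sign = fit_stump(xs, ys, w, idx)
        # The error is measured on all samples so the learner weight
        # alpha is still computed as in standard AdaBoost.
        err = sum(w[i] for i in range(n)
                  if stump_predict(xs[i], t, sign) != ys[i])
        err = min(max(err, 1e-10), 1 - 1e-10)     # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, t, sign))
        w = [w[i] * math.exp(-alpha * ys[i] * stump_predict(xs[i], t, sign))
             for i in range(n)]
        z = sum(w)
        w = [wi / z for wi in w]
    return model

def predict(model, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, t, s) for a, t, s in model)
    return 1 if score >= 0 else -1

# Toy usage on hypothetical, linearly separable 1-D data.
xs = [1, 2, 3, 4, 5, 6]
ys = [-1, -1, -1, 1, 1, 1]
model = adaboost_trimmed(xs, ys)
```

In this form the saving comes from `fit_stump` scanning only the trimmed index set. As boosting concentrates weight on hard samples, that set shrinks, which is the effect an adaptive rule would presumably tune from round to round.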