| In prison security systems, detecting and warning of violence such as fights and assaults is indispensable. Although numerous surveillance cameras have been deployed, traditional monitoring that relies on manual screening is not efficient enough to meet the real-time needs of modern prisons. With the continuing advancement of intelligent prison construction, using technologies such as machine vision and artificial intelligence to detect and warn of personnel and abnormal behaviors is becoming a new trend in security monitoring. However, owing to constraints on environmental conditions, hardware configuration, algorithm complexity, and many other factors, fast and accurate violence detection remains a challenging research topic. Using deep learning methods and targeting real-time application, this thesis investigates a lightweight person detection model and a lightweight violence detection model, and explores feasible solutions for real-time violence detection and warning systems. The research contents and contributions of the thesis are as follows: 1. A lightweight person detection model, named YOLO_Rep, is constructed by modifying and optimizing the YOLOv5 object detection network. First, a hybrid person detection dataset is built from public datasets for model training. Second, a new network, YOLO_Rep, is constructed by modifying the YOLOv5 object detection network through reparameterization and replacement of the upsampling module. To further reduce the number of parameters and increase inference speed, model compression based on channel pruning and inference acceleration is applied to the YOLO_Rep network. Experimental results show that the YOLO_Rep model, with a file size of only 11M, achieves an accuracy of 79.1% for person detection on the hybrid dataset, with an inference speed of up to 124.6 FPS. 2. A lightweight network architecture called Dual-Channel Improved ShuffleNet (DCISN) is proposed for
violence detection. The architecture draws on the parallel dual-channel idea of the SlowFast network and extracts spatio-temporal features using fast and slow channels in parallel. ShuffleNet units are redesigned to build lightweight Stage modules. In the newly designed Stage modules, cascaded depthwise convolution layers and a Squeeze-and-Excitation (SE) block are employed to address the lack of information exchange between channels through a weighting mechanism. Computation cost is thereby reduced effectively while good accuracy is maintained. To further improve accuracy, cross-stage connections are introduced into the architecture to reuse fast action information. Experiments were carried out on three recognized benchmark datasets: Hockey Fight, Movies Fight, and RWF-2000. Numerical results show that the detection accuracy of the DCISN model is close to the best reported in the literature. Notably, the DCISN model has very few parameters (0.168M), a very low computational cost (0.253 GFLOPs), and a very high forward inference speed (close to 120 FPS). 3. Based on the lightweight person detection and violence detection models, a prototype violence detection system is designed, developed, and deployed for experiments. A streaming media server is deployed to simulate the pushing and pulling of video streams. A violence detection process for practical scenarios is designed by using the person detection algorithm and the violence detection algorithm in tandem. In line with the requirements of a practical monitoring system, several functions are designed and demonstrated on the Web side. |
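
The tandem use of the two models described in contribution 3 can be illustrated with a minimal Python sketch. The abstract does not specify how the models are chained, so everything here is an assumption made for illustration: the function name `tandem_violence_alerts`, the callables `detect_persons` and `classify_clip` (stand-ins for YOLO_Rep and DCISN), and the parameters `clip_len`, `min_persons`, and `threshold` are all hypothetical. The sketch assumes the person detector acts as a cheap gate, so that the clip classifier only runs on windows where enough people are visible.

```python
from collections import deque
from typing import Any, Callable, Deque, Iterable, Iterator, List, Sequence, Tuple

Frame = Any  # placeholder type: a decoded video frame in a real system
Box = Tuple[float, float, float, float]  # hypothetical (x1, y1, x2, y2) person box


def tandem_violence_alerts(
    frames: Iterable[Frame],
    detect_persons: Callable[[Frame], Sequence[Box]],  # stand-in for the person detector
    classify_clip: Callable[[List[Frame]], float],     # stand-in for the violence classifier
    clip_len: int = 16,       # assumed clip length in frames
    min_persons: int = 2,     # assumed gate: a fight needs at least two people
    threshold: float = 0.5,   # assumed alert threshold on the violence score
) -> Iterator[Tuple[int, float]]:
    """Yield (frame_index, score) each time a full clip is scored as violent."""
    buffer: Deque[Frame] = deque(maxlen=clip_len)
    for index, frame in enumerate(frames):
        boxes = detect_persons(frame)
        if len(boxes) < min_persons:
            # Too few people for an interaction: discard the partial clip
            # and skip the more expensive clip classifier entirely.
            buffer.clear()
            continue
        buffer.append(frame)
        if len(buffer) == clip_len:
            score = classify_clip(list(buffer))
            if score >= threshold:
                yield index, score
            buffer.clear()  # start a fresh, non-overlapping clip
```

Under these assumptions, frames with fewer than `min_persons` detections reset the clip buffer, so alerts are raised only for uninterrupted windows of `clip_len` frames. A deployed system could equally use overlapping windows or crop the detected person regions before classification; the abstract does not say which.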