| With the advancement of technology and the decrease in hardware costs, cameras have become ubiquitous in streets and alleys, and video data is growing explosively. Abnormal behavior is defined relative to normal behavior and is characterized by suddenness and low frequency, so traditional manual inspection is no longer practical. With the development of deep learning, models can learn the relevant features automatically and monitor video data around the clock. When abnormal behavior occurs, the model makes a judgment automatically and triggers the corresponding warning message. Whether falling counts as abnormal behavior depends on the situation and the surrounding environment, for example a worker falling in a computer room or an elderly person living alone falling suddenly. Compared with falls in everyday settings, these situations are more likely to have irreparable consequences, so timely detection of and response to falls is of great significance. This thesis proposes a new approach to fall detection that exploits the temporal characteristics of falling and combines them with object detection networks. The main work of this thesis is as follows: 1. To meet the demand for a lightweight model, this thesis takes the YoloX-Darknet53 model as the basis and proposes an improved L-YoloX model that reduces the parameter count and computational complexity while enhancing the model’s feature extraction capability. Specifically, attention mechanisms are added to the CBL and FPN structures to strengthen the backbone network and improve the network’s ability to fuse multi-scale features. Depthwise separable convolutions are introduced in the stacked CBL structure to reduce network size and parameter count. The L-YoloX algorithm ultimately reduces the parameter count by 3.3% and the inference time by 2.9%, achieving the lightweighting goal. 2. Based on the temporal nature of falling 
behavior, the ST-GCN network is improved to enhance its ability to classify human skeletal keypoints. Specifically, the mask and adjacency matrix structures in the network are improved. In the mask, a self-attention mechanism is added so that the mask can take into account the connections between other keypoints and highlight the importance of different body parts. In the adjacency matrix, an additional learnable matrix is added to the original result to capture information between skeletal keypoints that are not spatially connected. As a result, the precision of the ST-GCN network is improved. The amount of skeletal keypoint information was also optimized in experiments, which improved the model’s inference speed. 3. To meet the requirements of model deployment, model quantization was studied based on the TensorRT framework. Specifically, network layer fusion, tensor fusion, and data precision quantization were investigated. Experiments verified that the model can be deployed on embedded platforms. In the end, the model’s inference speed was improved at the cost of a slight decrease in accuracy. |
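
The parameter saving that item 1 attributes to depthwise separable convolutions can be illustrated with a short calculation. This is a minimal sketch, not the thesis implementation: the function names and the layer size (256 input channels, 256 output channels, 3 x 3 kernel) are hypothetical values chosen for illustration, and bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a depthwise k x k convolution followed by a
    1 x 1 pointwise convolution (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def reduction_ratio(c_in, c_out, k):
    """Separable weights divided by standard weights; equals 1/c_out + 1/k**2."""
    return depthwise_separable_params(c_in, c_out, k) / standard_conv_params(c_in, c_out, k)


if __name__ == "__main__":
    # Hypothetical layer size, not taken from the thesis.
    c_in, c_out, k = 256, 256, 3
    print("standard conv weights: ", standard_conv_params(c_in, c_out, k))
    print("separable conv weights:", depthwise_separable_params(c_in, c_out, k))
    print("ratio: %.4f" % reduction_ratio(c_in, c_out, k))
```

This per-layer ratio describes a single replaced convolution only. The whole-model figures reported in the abstract also depend on how many layers are replaced and on the parameters added by the attention modules, so the two numbers are not directly comparable.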