| Since the start of the 21st century, rapid social progress has driven swift growth in short-video and related industries. Video is now an integral part of daily life, work, entertainment, and learning. However, extracting specific segments from a video is still laborious when done by hand. Face-recognition-based video editing applies computer vision to video files so that the segments featuring a target individual can be extracted automatically. RetinaFace and FaceNet are currently popular deep learning models for face detection and face recognition, respectively. This article modifies the network structures of the RetinaFace detection network and the FaceNet recognition network to extract and merge the segments of a video in which a specific individual appears, saving the time and effort of searching for those segments manually. The system operates on video files and addresses three main issues. (1) Selection and improvement of the face detection algorithm: face detection is a crucial step in the face recognition pipeline, and this study adopts RetinaFace as the detector. Because the choice of backbone network strongly affects model size, several backbone networks are compared. In addition, the feature pyramid network (FPN) of the original algorithm is modified to speed up detection. (2) Selection of the face recognition algorithm and use of a lightweight neural network: this study adopts FaceNet as the recognizer. Detected faces are normalized, and the normalized face images are fed into the FaceNet model for feature extraction. Experiments compare the model size and accuracy of three FaceNet backbones (MobileNetV1, MobileNetV3, and Inception-ResNet-V1), and the lightweight MobileNetV3 is selected. To strengthen feature extraction by enlarging the receptive field over the input face image, an RFB (receptive field block) structure is added to the FaceNet backbone. (3) Integration of face detection and recognition: the system is evaluated on extracting the segments in which specific individuals appear from 10 video clips, and face recognition with the Dlib model is run on the same 10 clips for comparison. The experimental results indicate that video clipping based on the combination of RetinaFace and FaceNet is feasible. |
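
The abstract does not say how per-frame recognition results are turned into clips. As a minimal sketch of one plausible approach, the following groups the frame indices in which the target face was recognized into time ranges that could then be cut and concatenated. The function name `frames_to_segments` and the parameters `fps` and `max_gap_frames` are hypothetical and not taken from the article. The gap tolerance is an assumption meant to bridge brief detection misses.

```python
from typing import Iterable, List, Tuple


def frames_to_segments(
    matched_frames: Iterable[int],
    fps: float,
    max_gap_frames: int = 5,  # assumed tolerance for short detection dropouts
) -> List[Tuple[float, float]]:
    """Group matched frame indices into (start_sec, end_sec) ranges.

    Two matched frames belong to the same segment when their indices
    differ by at most max_gap_frames.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")

    runs: List[List[int]] = []  # each entry is [first_frame, last_frame]
    for idx in sorted(set(matched_frames)):
        if runs and idx - runs[-1][1] <= max_gap_frames:
            runs[-1][1] = idx  # extend the current run
        else:
            runs.append([idx, idx])  # start a new run

    # The end time covers the full duration of the last matched frame.
    return [(first / fps, (last + 1) / fps) for first, last in runs]


if __name__ == "__main__":
    # Hypothetical recognition output: frame indices where the target appeared.
    hits = [0, 1, 2, 3, 50, 51, 52]
    for start, end in frames_to_segments(hits, fps=25):
        print(f"{start:.2f}s to {end:.2f}s")
```

In a full pipeline, the list of matched frames would come from running the detector and recognizer on each decoded frame, and the resulting ranges would be passed to a video cutting tool.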