| Individual identification is the basis for studying wildlife behaviour and ecology, and an important foundation for developing conservation policies for rare wildlife. The François’ langur (Trachypithecus francoisi), a Class I priority protected species in China, is particularly challenging to monitor at the individual level over time. Its body is mostly black, with only a white stripe running from the mouth to the base of the ear and around the back edge of the ear, so individuals are difficult to tell apart, few have been successfully identified in the field, and identification cannot be done quickly and accurately. Advances in computer vision have made facial recognition of François’ langur feasible. Deep learning networks have been very successful in image processing and image classification, making contactless, stress-free recognition of individual François’ langurs from image biometrics possible. Developments in artificial intelligence likewise provide technical support for recognising individual François’ langur faces, which can further related research on the species. To this end, data were collected manually in the field from November to December 2021 at the François’ langur Rare Animal Reproduction Centre in Wuzhou, Guangxi, using mobile phone photography, yielding 5,540 images of 41 individual François’ langurs (between 72 and 280 images per individual). Dataset production included screening the images, annotating facial information, sample pre-processing and data augmentation. This study proposes a
YOLOv5s-based deep learning method for François’ langur face detection and a FaceNet-based model for individual recognition of François’ langurs. The main research content and results are as follows: (1) Detecting the facial region in an image is the first step in image-based facial recognition of François’ langur. The images taken at the reproduction centre were first classified, annotated and divided into training and test sets. YOLOv3, YOLOv4, YOLOv5s, SSD and Faster R-CNN face detection models were then constructed and trained on the processed dataset, and the precision, mean average precision (mAP), detection speed and model size of the trained models were compared to determine the optimal François’ langur face detection model. (2) The performance of the different object detection algorithms varied considerably. The YOLOv3 model had a precision of 98.31%, an mAP of 92.41%, a model size of 235.00 MB and a detection speed of 3.93 s/p; the YOLOv4 model had a precision of 98.38%, an mAP of 94.35%, a model size of 251.00 MB and a detection speed of 3.89 s/p; the YOLOv5s model had a precision of 98.47%, an mAP of 99.97%, a model size of 14.00 MB and a detection speed of 2.84 s/p; the SSD model had a precision of 97.73%, an mAP of 93.59%, a model size of 91.00 MB and a detection speed of 3.48 s/p; and the Faster R-CNN model had a precision of 91.29%, an mAP of 97.93%, a model size of 111.00 MB and a detection speed of 5.73 s/p. Compared with the YOLOv3, YOLOv4, SSD and Faster R-CNN models, YOLOv5s showed the following changes: the precision improved by 0.16, 0.09, 0.74 and 7.18 percentage points, respectively; the mAP increased by 7.56, 5.62, 6.38 and 2.04 percentage points, respectively; the model size
decreased by 94.04%, 94.42%, 84.62% and 87.39%, respectively; and the detection speed improved by 26.99%, 27.74%, 18.39% and 50.44%, respectively. The results show that YOLOv5s outperforms the four comparison models (YOLOv3, YOLOv4, SSD and Faster R-CNN) on all four metrics: precision, mean average precision, detection speed and model size. The YOLOv5s model has the best overall performance and can detect the François’ langur facial region in images accurately and in real time under caged conditions. (3) The YOLOv5s-based face detection model for François’ langur is influenced by multiple factors. Under the same experimental conditions, the F1 score of the model with dataset sizes of 500, 1000, 2000, 3000 and 5540 images was 0.67, 0.72, 0.76, 0.83 and 0.86, respectively, so detection performance improved as the training set grew. For frontal and profile face poses, the precision was 79.30% and 61.10%, the recall was 0.86 and 0.80, the F1 score was 0.82 and 0.73, and the mean average precision was 88.90% and 73.20%, respectively. For faces photographed at long and close range, the precision was 73.10% and 83.95%, the recall was 0.80 and 0.85, and the mean average precision was 78.90% and 88.90%, respectively. Under bright and dim conditions, the precision was 80.10% and 82.80%, the recall was 0.80 and 0.83, the F1 score was 0.85 and 0.81, and the mean average precision was 85.90% and 86.70%, respectively. Regarding the effect of data augmentation on model performance, with and without data augmentation the precision was 83.70% and 82.20%, the recall was 0.90 and 0.83, and
the mean average precision was 93.40% and 88.50%, respectively. (4) The individual recognition model for François’ langur, built with FaceNet face recognition technology, compares the 128-dimensional feature vectors of two test images: a Euclidean distance of less than 1 indicates that the two images show the same individual, and a distance greater than 1 indicates different individuals. After training, the FaceNet-based model achieved an individual recognition accuracy of 78% on the constructed François’ langur validation dataset. Finally, combining the face detection model with the individual recognition model enables automated detection and recognition of individual François’ langurs, providing automated technical support for long-term monitoring, behavioural analysis and other research on the species. |
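
The model-size and mAP comparisons reported in result (2) follow from simple arithmetic on the per-model figures. The sketch below recomputes them from the values quoted in the abstract; the helper name `relative_decrease` and the dictionary layout are illustrative choices, not part of the original study.

```python
# Per-model figures quoted in result (2) of the abstract.
MODEL_SIZE_MB = {"YOLOv3": 235.00, "YOLOv4": 251.00, "SSD": 91.00, "Faster R-CNN": 111.00}
MAP_PERCENT = {"YOLOv3": 92.41, "YOLOv4": 94.35, "SSD": 93.59, "Faster R-CNN": 97.93}
YOLOV5S_SIZE_MB = 14.00
YOLOV5S_MAP_PERCENT = 99.97


def relative_decrease(baseline: float, new: float) -> float:
    """Return how much smaller `new` is than `baseline`, as a percentage of `baseline`."""
    return (baseline - new) / baseline * 100.0


# Relative reduction in model size (a percentage of the baseline size).
size_reduction = {
    name: round(relative_decrease(size, YOLOV5S_SIZE_MB), 2)
    for name, size in MODEL_SIZE_MB.items()
}

# Absolute gain in mAP (a difference of two percentages, i.e. percentage points).
map_gain = {
    name: round(YOLOV5S_MAP_PERCENT - value, 2)
    for name, value in MAP_PERCENT.items()
}

for name in MODEL_SIZE_MB:
    print(f"{name}: size -{size_reduction[name]}%, mAP +{map_gain[name]} points")
```

The distinction matters when reading the results: the size reduction is relative to each baseline model, whereas the mAP gain is an absolute difference between two percentages.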
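
The decision rule in result (4) can be illustrated with a minimal sketch. It assumes the 128-dimensional embeddings have already been produced by a trained FaceNet-style network and are L2-normalised, which the abstract does not state; the vectors below are toy values rather than real langur embeddings, the function names are hypothetical, and treating a distance of exactly 1 as "different" is an assumption, since the abstract leaves that case open.

```python
import math

# Threshold taken from the abstract: distance < 1 means the same individual.
SAME_INDIVIDUAL_THRESHOLD = 1.0


def l2_normalize(vec):
    """Scale a vector to unit length (assumed embedding convention)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in vec]


def is_same_individual(emb_a, emb_b, threshold=SAME_INDIVIDUAL_THRESHOLD):
    """Apply the Euclidean-distance rule to two embeddings of equal length."""
    if len(emb_a) != len(emb_b):
        raise ValueError("embeddings must have the same dimensionality")
    return math.dist(emb_a, emb_b) < threshold


# Toy 128-dimensional embeddings (hypothetical values, for illustration only).
anchor = l2_normalize([1.0] * 128)
similar = l2_normalize([1.0] * 127 + [1.2])       # nearly identical direction
different = l2_normalize([1.0] * 64 + [-1.0] * 64)  # orthogonal to the anchor

print(is_same_individual(anchor, similar))    # small distance
print(is_same_individual(anchor, different))  # distance is sqrt(2)
```

In a full pipeline, the face region cropped by the detection model would be passed through the embedding network first, and the resulting vector compared against stored embeddings of known individuals with this rule.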