| Facial landmark localization aims to locate the coordinates of semantically meaningful landmarks (such as facial organs). Many face image analysis technologies, such as face recognition, head pose estimation, and 3D face reconstruction, need facial landmark localization as an intermediate step. As facial landmark localization has developed, the research object is no longer limited to frontal face images collected in a single simulated environment, but also includes face images affected by factors such as pose, occlusion, and illumination. Although research on facial landmark localization has achieved fruitful results, such unconstrained samples remain difficult. Facial landmark localization algorithms use many different architectures, of which two are most common: the cascaded shape regression model and the Gaussian heatmap regression network. The cascaded shape regression model iteratively updates a frontal, neutral initial face shape until it is close to the real face shape. Because of the limitation of the initial shape, such models have great difficulty regressing large-pose face images. The Gaussian heatmap regression network uses Gaussian distribution maps of the coordinate points as the regression target, which improves the spatial generalization ability of the model. However, because the correlation between landmarks is not effectively expressed, the predictions may be scattered when the face image is affected by pose, occlusion, and other factors. To address the problems of these two structures, we propose improved schemes that increase the robustness of the algorithm to unconstrained conditions (such as pose). The main work of this paper is as follows: 1. Research on a facial landmark localization algorithm based on fusion subspace and 3D fitting. Considering that facial landmark localization methods based on the cascaded shape regression model are limited by the initial shape, we propose a
two-layer cascaded regression model structure, in which the first layer generates roughly aligned initial shapes, and the second layer takes the output of the first layer as the initial shape to achieve more accurate facial landmark localization. The regression target of the first layer is the salient face shape defined in this paper. Using the 3D fitting method proposed in this paper, the corresponding complete face shape can be generated. The salient face shape is located first because reducing the number of landmarks reduces the difficulty of regression to a certain extent. To further improve the prediction accuracy of the salient face shape, we introduce a sample partition method based on fusion subspace, which divides the samples into multiple subsets according to their fusion features. The samples in the same subset are similar in pose, and each subset is trained separately to improve the overall pose robustness. 2. Research on a facial landmark localization algorithm combining shape constraints and Gaussian heatmap regression. Because the Gaussian heatmap regression network is weak at expressing the correlation between facial landmarks, we propose introducing an attention map as a shape constraint into the Gaussian heatmap regression network. The algorithm consists of three parts: a 3D model parameter prediction network, an attention map generator, and a Gaussian heatmap regression network. The 3D model parameter prediction network takes the lightweight network MobileNet-v3 as its backbone and predicts the parameters of the 3D face model. We also propose a new data synthesis method to improve the generalization ability of the model. From the predicted 3D face model parameters, a 3D face shape that preliminarily fits the face image can be reconstructed. The attention map generator projects the 3D face shape onto the 2D plane, connects the landmarks to form a boundary line map, and then
converts the boundary line map into an attention map. The Gaussian heatmap regression network takes HRNet as its backbone and the fusion of the original image and the attention map as its input to achieve coarse-to-fine facial landmark localization. Ablation experiments and comparison experiments with other advanced algorithms are conducted on the 300W, WFLW, and COFW facial landmark localization datasets. Extensive experimental results show that the algorithm achieves high accuracy and robustness for facial landmark localization on large-pose and unconstrained face images. |
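
To make the Gaussian heatmap regression target mentioned in the abstract concrete, the following is a minimal sketch of how a landmark coordinate can be encoded as a Gaussian distribution map and decoded again by taking the peak response. It is an illustration only, not the implementation used in this work: the function names `gaussian_heatmap` and `decode_heatmap`, the 16 x 12 grid, and `sigma=1.5` are all assumptions chosen for the example.

```python
import math


def gaussian_heatmap(x, y, width, height, sigma=1.5):
    """Encode one landmark as a height x width grid peaking at (x, y).

    Hypothetical helper: sigma and the grid size are illustrative choices.
    """
    return [
        [
            math.exp(-((col - x) ** 2 + (row - y) ** 2) / (2.0 * sigma ** 2))
            for col in range(width)
        ]
        for row in range(height)
    ]


def decode_heatmap(heatmap):
    """Recover (x, y) as the position of the maximum response."""
    best_value, best_xy = -1.0, (0, 0)
    for row_idx, row in enumerate(heatmap):
        for col_idx, value in enumerate(row):
            if value > best_value:
                best_value, best_xy = value, (col_idx, row_idx)
    return best_xy


# Two made-up landmark positions, each encoded in its own heatmap channel.
landmarks = [(3, 4), (10, 6)]
heatmaps = [gaussian_heatmap(x, y, width=16, height=12) for x, y in landmarks]

# Each channel is decoded independently of the others.
decoded = [decode_heatmap(h) for h in heatmaps]
```

In this sketch every channel is decoded on its own, with no term linking one landmark to another. That independence is consistent with the weakness described in the abstract, where predictions may scatter under pose or occlusion unless an additional shape constraint is supplied.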