
Research On Intelligent Detection Of Facial Acupoints

Posted on: 2024-01-02  Degree: Doctor  Type: Dissertation
Country: China  Candidate: T T Zhang
GTID: 1524307325950029  Subject: Doctor of Engineering
Abstract/Summary:
Research on Traditional Chinese Medicine (TCM) has recently received strong support from national plans, and acupoint therapy robots (for acupuncture, massage, etc.) have gradually become a hot research topic. As the core task of such robots, intelligent acupoint detection is both the foundation of acupoint therapy and a primary way to advance the modernization of TCM, with great potential for real-world research and applications. This work studies facial acupoint detection (FAD) from visual images, aiming to detect the categories and specific locations of acupoints on the human face. FAD can assist in locating facial acupoints autonomously and efficiently, which improves the efficiency of acupressure physicians and can also be used for their daily training.

Existing FAD methods fail to meet the requirements of intelligent acupoint therapy, and deep learning is a promising way to advance the technique toward real-world applications. However, because of task-specific properties of FAD, such as non-standardized annotations, dense distributions, and complicated implicit structural relationships, deep learning-based acupoint detection approaches still face extensive technical challenges. Currently, almost all works are based on private datasets that contain only a small number of sparsely distributed acupoints, so they cannot be applied to real-world acupoint therapy. Therefore, the first and fundamental step of the FAD study is the construction of an acupoint dataset. To this end, several experienced professional acupressure physicians were invited to create a qualified dataset. Considering the spatially dependent features, the high-resolution network (HRNet) is adopted for the FAD task. In addition, to extract discriminative features that support the FAD task under the small-sample problem, a resolution, channel, and spatial attention fusion-based HRNet and
a structure-consistency enhanced adversarial auto-encoder (AAE) are proposed for the FAD task. Finally, a prototype is developed to test the proposed approaches. The primary works and contributions are summarized below:

1. A multi-channel heatmap regression-based landmark detection approach is designed for the FAD task. A qualified dense FAD dataset is constructed to support real applications, together with custom-defined evaluation metrics. The high-resolution deep neural architecture is studied specifically for the FAD task. The resulting dataset, baselines, and benchmark results are intended to inform future research. In TCM clinical practice, FAD requires physicians to identify the specific acupoint type and perform the treatment at the center of the acupoint to ensure the therapeutic effect. Considering these two requirements, acupoint classification and center localization, this work performs FAD with a multi-channel heatmap regression-based landmark detection approach, trained end to end with a single neural architecture. In the predicted heatmaps, each channel denotes a specific acupoint according to a pre-defined order index, and the most prominent activation of a heatmap represents the acupoint center. Since existing private and sparse acupoint datasets cannot be used in real-world applications, a new dataset, called FAcupoint, is first constructed to support future research, in which a total of 43 acupoints are densely annotated for each sample. Several experienced professional acupressure physicians were invited to clarify the annotation protocols and working procedures, collect the raw samples, and annotate the selected images, yielding a total of 654 samples. Three new measurements, FA_NME, FA_FR, and FA_AUC, are defined to evaluate model performance. After extensive investigation and analysis, the HRNet is adopted for FAD given the spatially dependent nature of the
landmark detection task. The fusion of features across the multi-resolution paths in HRNet is expected to address the loss of detail and semantic features caused by common convolutional neural networks. To compare task performance, several baselines with different technical frameworks are selected and adapted to the FAD task. The experimental results on the new FAcupoint dataset demonstrate that HRNet outperforms the other baselines, achieving 2.9042% FA_NME, which provides the benchmark results for future studies.

2. After investigating the decision-making process of TCM physicians, a novel resolution, channel, and spatial attention-based fusion (RCSAF) module is proposed and incorporated into the HRNet to form the RCSfNet, which can adaptively learn key facial structures to effectively improve FAD performance. In TCM clinical practice, the complicated and implicit structural constraints between facial landmarks and acupoints are important references for physicians when locating acupoints. For instance, the yintang acupoint is located at the midpoint between the left and right eyebrows. However, in the HRNet, the plain sum-based feature fusion across resolution paths may not be optimal for extracting informative textural features and patches. To address this issue, the RCSAF module guides the model to learn informative features and patches for the FAD task in a learnable manner. The resolution and channel attention paths are designed to fuse both resolution-level and global features, while the channel and spatial attention paths capture multi-view contextual details and pixel-to-pixel structural relationships. Finally, feature tuning across all resolution paths is performed to learn the complicated and implicit structural constraints between facial landmarks
and acupoints, guiding the model to perform FAD by imitating the decision process of human experts, who focus on facial landmarks (such as the eyes and nose). Extensive experiments are conducted on the new FAcupoint dataset, and the proposed approach (with the RCSAF module) achieves 2.4228% FA_NME, i.e., about 17% relative improvement over the original HRNet baseline. The results also confirm that the proposed approach converges with higher performance. By visualizing the learned features of the HRNet network, it can be found that the proposed approach focuses on the heatmaps of the facial landmarks, clarifying the latent structural correlations between acupoints and facial landmarks. The results also support the motivation of the proposed approach, i.e., performing FAD by imitating the behavior of human experts.

3. Considering the representation learning capacity of the self-supervised mechanism, a structure-consistency enhanced feature learning approach is proposed to extract facial structural features based on an adversarial auto-encoder neural architecture. The facial and acupoint structural features are further fused to improve FAD performance, which also reduces the demand for annotated samples. Annotating FAD samples is highly expert-dependent, costly, and laborious, so it is hard to collect high-quality training samples at the desired scale, which limits the capacity of a model to learn the latent structural correlations between acupoints and facial landmarks by supervised learning alone. To address this issue, a structure-consistency enhanced feature learning approach based on an adversarial auto-encoder neural architecture (called FADbR) is proposed for the FAD task, in which a multi-stage training strategy is used to optimize the model. An autoencoder architecture is designed for self-supervised representation learning, which learns facial
textures and structural features through an image-to-image encode-reconstruct procedure. The adversarial learning mechanism guides the model to learn enhanced structural features through the structural consistency between the reconstructed image and the original image, which is expected to improve the robustness of the learned compact, low-dimensional abstract features that support the FAD task. Based on the learned feature representations, interleaved feature-sharing layers are incorporated into the AAE architecture to fuse the facial features and acupoint features at multiple resolution levels, which helps optimize the structural features for the FAD task in a supervised manner. Experimental results demonstrate that the proposed FADbR framework yields a large performance improvement, achieving 1.2866% FA_NME, about 47% relative improvement over the RCSAF model, and can be applied to practical real-world TCM applications. Most importantly, thanks to the proposed structure-consistency enhanced feature learning, the framework outperforms the HRNet baseline with only 100 training samples, which further confirms the efficiency and effectiveness of the learned structural representations of facial landmarks.

In summary, facing the key problems and technical limitations of the field, this work contributes to the FAD study in problem definition, dataset construction, method investigation, and prototype validation. A series of deep architecture-based models are designed for the FAD task, and a prototype is developed to validate their applicability. This work can provide a foundation for moving from research to industrial practice, supporting a modernized and intelligent development of TCM with great practical significance.
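
The multi-channel heatmap formulation from contribution 1 (one channel per acupoint, with the strongest activation taken as the acupoint center) and the reported relative improvements can be illustrated with a minimal Python sketch. The abstract does not give the definition of FA_NME, so `normalized_mean_error` below is only a generic normalized mean error with a caller-supplied normalization distance, not the dissertation's metric. All function names are hypothetical and NumPy is an assumed dependency.

```python
import numpy as np


def decode_heatmaps(heatmaps: np.ndarray) -> np.ndarray:
    """Return a (K, 2) array of (x, y) peak locations, one per channel.

    `heatmaps` has shape (K, H, W); channel k is assumed to correspond to
    the k-th acupoint in a fixed, pre-defined order.
    """
    k, h, w = heatmaps.shape
    flat_idx = heatmaps.reshape(k, -1).argmax(axis=1)  # strongest activation
    ys, xs = np.unravel_index(flat_idx, (h, w))
    return np.stack([xs, ys], axis=1).astype(np.float64)


def normalized_mean_error(pred: np.ndarray, target: np.ndarray,
                          norm_dist: float) -> float:
    """Generic NME in percent: mean point-wise Euclidean error / norm_dist.

    Assumption: the choice of `norm_dist` (e.g. a face-size measure) is left
    to the caller, because the abstract does not define FA_NME.
    """
    errors = np.linalg.norm(pred - target, axis=1)
    return 100.0 * float(errors.mean()) / norm_dist


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative error reduction in percent (lower error is better)."""
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    # Synthetic stand-in for a network output: 43 channels, as in FAcupoint.
    rng = np.random.default_rng(0)
    fake_heatmaps = rng.random((43, 64, 64))
    print(decode_heatmaps(fake_heatmaps).shape)

    # Error figures quoted in the abstract.
    print(round(relative_improvement(2.9042, 2.4228), 1))  # HRNet -> RCSAF
    print(round(relative_improvement(2.4228, 1.2866), 1))  # RCSAF -> FADbR
```

With the figures quoted in the abstract, `relative_improvement` gives roughly 16.6% and 46.9%, consistent with the stated "about 17%" and "about 47%". Taking a hard argmax is the simplest decoding choice; sub-pixel refinement is a common alternative, but the abstract does not say which is used.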
Keywords/Search Tags: Traditional Chinese Medicine, Facial Acupoint Detection, Deep Neural Network, Attention Mechanism, Feature Representation