| Gaze tracking is a technique for measuring people's visual attention by mechanical or optical means, and it is the most direct and quantitative way to analyze the distribution of attention and preferences. In recent years, gaze tracking has gradually become one of the hot research topics in human-computer interaction. It has been widely used in fields such as human-computer interaction, visual attention analysis, advertising, psychology, medicine, and human factors research, and it is closely related to people's daily life. It is therefore of great importance to study accurate and efficient gaze tracking systems. Since the 1970s, researchers have proposed a variety of methods to detect eye movement, such as scleral search coils, electro-oculography, and the corneal reflection method. However, most existing gaze tracking algorithms have the following problems: (1) Existing gaze tracking methods need multiple high-resolution near-infrared (NIR) cameras with active NIR light sources, so the hardware setup is complex, expensive, and limited in use. (2) The accuracy of gaze tracking algorithms depends heavily on the resolution of the acquired eye images, and both accuracy and stability degrade severely with low-resolution eye images and occluded eyes. To address these problems, this dissertation develops several low-cost gaze tracking systems for different application scenarios and proposes a gaze tracking algorithm based on deep learning combined with personal calibration. The main research contents and innovations of this dissertation are as follows: (1) To address the small sample sizes and systematic manual labeling errors that affect machine learning-based gaze tracking algorithms under visible light, this dissertation establishes a visible-light virtual eye dataset (Unity Eyes dataset, 23,011 images), a high-resolution Asian
infrared and visible matched eye image dataset (IMVE dataset, 78,252 images), and a high-resolution Asian gaze tracking dataset (KLBS-eye dataset, 15,350 images). An innovative labeling method is proposed that uses eye features extracted from infrared eye images as labels for the matched visible images, which improves labeling accuracy while avoiding manual labeling errors and provides a basis for the accurate iris center localization and gaze tracking algorithms developed subsequently. (2) Fast, efficient, and accurate open/closed eye detection is a prerequisite for stable gaze tracking. This dissertation combines different features (original image, Local Binary Pattern, XY projection, Histogram of Oriented Gradients (HOG)) with different classifiers (ridge regression, support vector machine, cascade classifier, stacked autoencoder, and convolutional neural network) and compares their performance on the ZJU eye closure dataset. The experimental results show that the autoencoder model based on the HOG feature achieves an accuracy of 94.75%, which is better than the comparison algorithms. The small and efficient ridge regression model detects closed eyes well and has the fastest computation speed among all algorithms. Comparing the properties and performance of these descriptors and classifiers makes it possible to choose the most suitable algorithm for each application scenario. Finally, a driver fatigue detection system is built on the ridge regression-based blink detection algorithm. (3) Under a near-infrared light source there is a clear boundary between the pupil and the iris, whereas under visible light there is no clear edge between them. The iris is also often occluded by the upper and lower eyelids, which makes iris center localization inaccurate. To address this problem, this dissertation proposes a style transfer network-based iris center localization algorithm for visible light, based on the matched infrared and visible image dataset established above. The algorithm
uses a modified U-Net structure to transform visible eye images into infrared-style eye images and generates heatmaps to predict iris centers. The algorithm is compared with four classical algorithms on the IMVE dataset and the I-SCIAL-DB dataset. The proportion of its predictions with an error within 5 pixels reaches 75% on the IMVE dataset and 84% on the I-SCIAL-DB dataset, both better than the other methods (64% for IMVE and 83% for I-SCIAL-DB). Finally, a model-based gaze tracking system under visible light is built on the proposed iris center localization algorithm; it reduces the gaze estimation error from 1.63° to 1.50° (a relative reduction of about 8%) compared with other algorithms while allowing the subject's head to move freely. (4) Model-based gaze tracking algorithms require additional infrared cameras and infrared light sources and depend heavily on image quality. To solve this problem, this dissertation proposes a deep learning-based gaze tracking algorithm with personal calibration under visible light. The algorithm is built on a residual neural network, uses a style transfer technique to convert virtual images to the style of real images to mitigate the small-sample problem, and introduces personal parameter calibration, which allows the network to learn user-specific information. The experimental results show that the proposed algorithm reduces the gaze tracking error to 2.55°, 3.40°, and 0.59° on the MPIIGaze, UT-Multiview, and KLBS-eye datasets, respectively, outperforming existing methods on all three datasets (MPIIGaze: 2.76°, UT-Multiview: 4.24°, KLBS-eye: 1.56°). In summary, this dissertation proposes an iris center localization algorithm and both model-based and appearance-based gaze tracking algorithms for visible-light applications. It also establishes iris center localization and gaze tracking datasets for Asian people, which provide a basis for research on gaze tracking
technology for Asian people in visible light. |
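
The two kinds of figures reported in this abstract, the share of iris centers located within 5 pixels and the gaze error in degrees, can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the definitions below are assumptions based on common practice: gaze error is taken as the angle between predicted and true 3D gaze direction vectors, and the pixel threshold is treated as inclusive. The function names `angular_error_deg` and `accuracy_within_pixels` are hypothetical and the sample values are toy data, not results from the dissertation.

```python
import math


def angular_error_deg(pred, truth):
    """Angle in degrees between two 3D gaze direction vectors (assumed metric)."""
    dot = sum(p * t for p, t in zip(pred, truth))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_t = math.sqrt(sum(t * t for t in truth))
    # Clamp so floating-point rounding cannot push acos outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_t)))
    return math.degrees(math.acos(cos_angle))


def accuracy_within_pixels(pred_centers, true_centers, threshold=5.0):
    """Fraction of predicted iris centers within `threshold` pixels of the label."""
    hits = sum(
        1
        for (px, py), (tx, ty) in zip(pred_centers, true_centers)
        if math.hypot(px - tx, py - ty) <= threshold  # inclusive by assumption
    )
    return hits / len(true_centers)


if __name__ == "__main__":
    # Toy data only: four predicted iris centers and their labels, in pixels.
    predicted = [(10, 10), (20, 20), (30, 30), (40, 40)]
    labeled = [(10, 13), (20, 26), (33, 34), (40, 40)]
    print(accuracy_within_pixels(predicted, labeled))

    # Two gaze directions pointing roughly toward a camera along -z.
    print(angular_error_deg((0.0, 0.0, -1.0), (0.05, 0.0, -1.0)))
```

Normalizing inside `angular_error_deg` means the inputs do not need to be unit vectors, which is convenient when gaze is stored as unnormalized direction estimates.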