Research On The Visual Attention Mechanism For 3D Scenes

Posted on: 2023-10-28
Degree: Master
Type: Thesis
Country: China
Candidate: Z L Wen
Full Text: PDF
GTID: 2558306914957739
Subject: Electronic Science and Technology
Abstract/Summary:
We live in an era of information technology, and digital media has become the main carrier of people’s everyday exchange of information. Compared with traditional media such as text and speech, images now account for a growing share of the information around us. Visual perception is an important physiological function through which humans interact with the outside world, and the human visual attention mechanism lets the brain process large amounts of information in complex scenes effectively and so understand a scene efficiently. Extracting salient regions by simulating the attention mechanism of the human eye has become a research hotspot. 3D display technology offers viewers an experience closer to the real world, and research on visual attention mechanisms can help display technology simulate human vision and allocate resources reasonably. However, most current research on visual attention is based on 2D images, and little is known about 3D scenes; accurately predicting human eye gaze in 3D scenes remains a challenge. To address these problems, this thesis studies the visual attention mechanism for 3D scenes. The main contents and innovations are as follows:

(1) To address the lack of effective depth-feature fusion in visual attention models for static 3D scenes, an eye movement dataset of static scenes is constructed and a bottom-up, multi-scale, depth-enhanced fusion method for visual attention prediction is proposed. The static dataset contains 1075 multi-scene images (NUS: 600, NCTU: 475) with corresponding fixation density maps. The model uses a deep feature enhancement module to perform fusion prediction at multiple scales. In a quantitative comparison with several other prediction algorithms (CC: 0.807, AUC: 0.889, NSS: 2.433, SIM: 0.697, KL: 0.751), the model is best on the first four metrics, with an average improvement of 8.9% over the second-best method, and it achieves advanced results in multi-object scenes.

(2) To address the lack of visual attention models for dynamic 3D scenes, dynamic visual strategies in 3D scenes are explored, including the construction of a dynamic eye movement dataset and a corresponding prediction model based on inter-frame spatiotemporal feature fusion. The dynamic experimental material was shot and synthesized with a two-view camera and includes 3 ball-throwing videos with a duration of 4-5 minutes. The two-viewpoint prediction model takes spatial features enhanced by the disparity map of the two viewpoints, learns the inter-frame correlation from them to obtain motion information, and outputs the final prediction map through a spatiotemporal fusion module. The quantitative results of the model (CC: 0.807, AUC: 0.889, NSS: 2.433, SIM: 0.697, KL: 0.751) show clear advantages over other models: the best-performing metric is 9% higher than that of the second-best model, and the average improvement is 5.76%. Qualitative results also demonstrate the model's advantage in predicting attention on objects whose motion changes.
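
The abstract reports CC, AUC, NSS, SIM and KL without defining them. As a rough illustration only, the sketch below computes four of these using the definitions commonly found in the saliency-prediction literature; it is not the thesis author's evaluation code, and the exact normalization used in the thesis is not stated. The function names, the `EPS` constant and the list-of-rows map format are assumptions made for this example, and AUC is omitted because it has several variants.

```python
import math

EPS = 1e-12  # assumed small constant to avoid division by zero


def _flatten(grid):
    """Turn a 2D map (list of rows) into a flat list of floats."""
    return [float(v) for row in grid for v in row]


def _to_distribution(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    return [v / (total + EPS) for v in values]


def _standardize(values):
    """Shift to zero mean and scale to unit standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / (std + EPS) for v in values]


def cc(pred, density):
    """Linear correlation coefficient between prediction and fixation density map."""
    p = _standardize(_flatten(pred))
    d = _standardize(_flatten(density))
    return sum(a * b for a, b in zip(p, d)) / len(p)


def sim(pred, density):
    """Similarity: histogram intersection of the two maps as distributions."""
    p = _to_distribution(_flatten(pred))
    d = _to_distribution(_flatten(density))
    return sum(min(a, b) for a, b in zip(p, d))


def kl(pred, density):
    """KL divergence of the prediction from the fixation density (lower is better)."""
    p = _to_distribution(_flatten(pred))
    d = _to_distribution(_flatten(density))
    return sum(b * math.log(EPS + b / (a + EPS)) for a, b in zip(p, d))


def nss(pred, fixations):
    """Normalized scanpath saliency: mean standardized prediction at fixated pixels."""
    p = _standardize(_flatten(pred))
    hits = [a for a, f in zip(p, _flatten(fixations)) if f > 0]
    if not hits:
        raise ValueError("fixation map contains no fixated pixels")
    return sum(hits) / len(hits)
```

Under these definitions, higher CC, NSS and SIM indicate a better match, while KL is a dissimilarity and lower values are better.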
Keywords/Search Tags: visual attention mechanism, 3D scene, convolutional neural network, saliency prediction