Real-time Monocular Depth Estimation Based On AIoT Edge Device

Posted on:2024-06-22

Degree:Master

Type:Thesis

Country:China

Candidate:X H Liu

Full Text:PDF

GTID:2568307067973739

Subject:New Generation Electronic Information Technology (including quantum technology, etc.) (Professional Degree)

Abstract/Summary:

Depth estimation is an important part of how computers understand 3D scenes and has a wide range of applications in the artificial intelligence of things (AIoT). IoT devices use images to estimate the depth of a scene, which enables tasks such as device navigation, path planning and augmented reality. Low-power, real-time depth estimation on edge devices is therefore important for extending AIoT applications. However, most current depth estimation algorithms pursue highly accurate depth information, which places heavy demands on device hardware. To achieve high-performance depth estimation on edge devices, this thesis proposes a low-latency encoder and decoder and achieves a good trade-off between real-time performance and accuracy in monocular depth estimation by fusing different cue information with global information. The main innovations are as follows:

(1) A real-time monocular depth estimation method that fuses object structure information within a single pipeline is proposed. To address the scarcity of real data for depth estimation tasks, the model uses a style translation technique so that the network can extract the structural information of objects from images in different styles. The model can successfully predict depth in real scenes even when trained only on virtual datasets. It also improves depth estimation performance by guiding the network to generate depth features through multi-scale structural information and global attention.

(2) A real-time monocular depth estimation method that merges global features is proposed. A hardware-friendly, low-latency encoder and decoder are designed for AI edge devices, and with them the model achieves real-time inference on edge devices. To address the over-reliance of convolutional networks on local features, the thesis proposes the Transformer Conv module, which combines convolution and Transformer. The model improves the accuracy of depth prediction by fusing global and local features. The proposed method reaches an RMSE of 0.554, outperforming existing low-latency monocular depth estimation methods.

(3) To evaluate the effectiveness of the proposed networks, experiments were conducted on commonly used indoor datasets, and the models were deployed on Jetson Nano edge devices for field testing. The experiments show that the models achieve excellent performance on edge devices and outperform related approaches.
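For readers unfamiliar with the RMSE figure quoted above, the following is a minimal sketch of how root-mean-square error is commonly computed for depth maps (lower is better). It is not the thesis's evaluation code: the function name `depth_rmse`, the sample values, and the convention that a non-positive ground-truth depth marks a missing pixel are all assumptions made for illustration.

```python
import math


def depth_rmse(pred, target):
    """Root-mean-square error over pixels that have valid ground truth."""
    squared_errors = [
        (p - t) ** 2
        for p, t in zip(pred, target)
        if t > 0  # assumption: non-positive ground truth means "no measurement"
    ]
    if not squared_errors:
        raise ValueError("no valid ground-truth pixels")
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical flattened depth maps; the third ground-truth pixel is missing.
predicted = [1.0, 2.5, 4.0, 3.0]
ground_truth = [1.0, 2.0, 0.0, 4.0]
rmse = depth_rmse(predicted, ground_truth)
```

Masking invalid pixels matters because depth sensors often leave holes in the ground truth; including them would distort the error.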

Keywords/Search Tags:

AIoT, attention, real-time monocular depth estimation, transformer, style transformer

Related items

1	Research On Monocular Depth Estimation Technology Based On Self-Attention Network
2	Research On Monocular Depth Estimation Algorithm Based On Deep Learning
3	Research And Implementation Of Light-weight Monocular Depth Estimation Algorithm
4	Research On Monocular Depth Estimation Based On Swin Transformer
5	Research On Single Image Depth Estimation Method Based On Multi-scale Attention Mechanism
6	Research On Self-supervised Monocular Scene Flow Estimation Method Base On Transformer
7	Monocular Depth Estimation For Nighttime Scenes Algorithm Study
8	Research On 3D Human Posture Estimation Algorithm Based On Transformer Model
9	Research On Monocular Depth Prediction Algorithm Based On Deep Learning
10	Real-time Monocular Depth Estimation And 3D Reconstruction