| Distance-to-collision refers to the relative distance between a quadrotor drone and the nearest obstacle within the sensing range of its onboard sensor. Research on distance-to-collision is of great significance, since the successful navigation of drones in search and rescue, industrial inspection, precision agriculture, and other fields depends on accurate obstacle distance estimation. Compared with outdoor environments, estimating distance-to-collision indoors is more difficult because indoor scenes include not only richly textured and colored environments but also texture-deficient ones. To make matters worse, indoor drones cannot receive GPS signals. Therefore, sensors such as laser rangefinders, RGB-D cameras, stereo vision, or monocular vision are usually used for indoor environment perception. However, simultaneous localization and mapping algorithms based on laser rangefinders, RGB-D, or monocular vision sensors take a long time to build a map of the environment. Stereo vision-based depth estimation cannot obtain enough effective features in scenes lacking texture, such as the white walls and glass commonly found indoors. Compared with these traditional methods, deep learning has a much stronger ability to learn representations from real, complex environments, which has made it widely used in a variety of computer vision tasks. Additionally, commercial quadrotor drones are often equipped with only a forward-looking camera, which limits the use of these solutions. Therefore, obtaining the distance-to-collision through monocular vision is an important task for autonomous obstacle avoidance by indoor drones. For distance-to-collision estimation based on monocular vision, this paper proposes several deep learning algorithms: navigable probability estimation based on feature map region division, distance-to-collision estimation based on ordinal regression, and distance-to-collision estimation based on an attention mechanism and a multi-module spatiotemporal model. The main
innovations of this paper are as follows: 1) To solve the problem of slow inference speed caused by repeated extraction of image features, a navigable probability prediction algorithm based on feature map region division is proposed. The algorithm uses a shared convolutional neural network to extract feature maps and divides the maps into three overlapping rectangular windows along the width direction. Three classifiers then predict the navigable probability of the three windows respectively. Finally, a planning strategy based on the three probabilities is proposed to realize autonomous obstacle avoidance flight of the quadrotor drone in a corridor environment. Experimental results show that the inference speed of the proposed model is 5 times higher than that of AlexNet. 2) To address the poor prediction performance in the decision region, which is caused by the regression loss function over-training on samples with few close-range or long-range features, a distance-to-collision estimation algorithm based on an ordinal regression model is proposed. First, a non-uniform discretization strategy suited to distance-to-collision is adopted to obtain discrete labels: the strategy quantizes the distance-to-collision into three sub-intervals (danger, decision, and safety), and a spacing-increasing discretization strategy is further used to divide the decision interval into a set of non-overlapping discrete intervals. Then, an ordinal regression model is trained on the ordering information of the discrete labels. Finally, a distance decoder is designed to improve the decoding accuracy in the decision interval. The experimental results show that the RMSE of the proposed algorithm in the decision region is 48.17% lower than that of AlexNet and 1.23% lower than that of Two-Stream. 3) To address the excessively long model training time caused by the large number of hyperparameters, a distance-to-collision estimation algorithm based on an
attention mechanism and a multi-module spatiotemporal model is proposed. Since the discrete intervals are very important to model performance, the hyperparameters of the non-uniform discretization strategy are varied to select the discrete intervals that perform best. However, this greatly increases the training time of the model. Therefore, a multi-module ordinal regression model based on diversity labels is proposed. To further improve the model’s prediction performance, a feature fusion method based on an attention mechanism is adopted. The method uses two independent streams to extract static appearance information from a pair of adjacent time frames. Then, two identical attention modules encode the feature maps of the two streams respectively, and the two encoded streams are fused into one stream to extract temporal features. The experimental results show that the training time of the proposed algorithm is more than 50% shorter than that of a single model. |
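
The non-uniform discretization described in innovation 2) can be illustrated with a minimal sketch. The abstract does not give the exact formula or thresholds, so the code below makes several assumptions: it uses the log-space form of spacing-increasing discretization that is common in ordinal-regression depth estimation, the function names (`sid_edges`, `discretize`, `ordinal_targets`) are hypothetical, and the 0.5 m and 8 m thresholds in the usage note are placeholders, not the authors' values. The binary ordinal encoding is likewise one common formulation, not necessarily the one used in the paper.

```python
import math


def sid_edges(d_min, d_max, k):
    """Return k + 1 bin edges that split [d_min, d_max] into k bins.

    The edges are uniform in log space, so bin width grows with distance
    (assumed form of spacing-increasing discretization).
    """
    if not (0 < d_min < d_max) or k < 1:
        raise ValueError("require 0 < d_min < d_max and k >= 1")
    log_ratio = math.log(d_max / d_min)
    return [d_min * math.exp(log_ratio * i / k) for i in range(k + 1)]


def discretize(distance, d_danger, d_safe, k):
    """Map a distance to an integer label in 0..k+1.

    0      -> danger interval   (distance < d_danger)
    1..k   -> decision interval (split into k spacing-increasing bins)
    k + 1  -> safety interval   (distance >= d_safe)
    """
    if distance < d_danger:
        return 0
    if distance >= d_safe:
        return k + 1
    edges = sid_edges(d_danger, d_safe, k)
    for i in range(k):
        if distance < edges[i + 1]:
            return i + 1
    return k  # guard against floating-point rounding at the upper edge


def ordinal_targets(label, num_classes):
    """Encode a label as num_classes - 1 binary targets.

    Target j is 1 when label > j, which preserves the ordering
    information that an ordinal regression model is trained on.
    """
    return [1 if label > j else 0 for j in range(num_classes - 1)]
```

With the placeholder thresholds, `discretize(3.0, 0.5, 8.0, 4)` places a 3 m reading in the third decision bin, and `ordinal_targets` turns that label into a vector of cumulative binary targets. Narrow bins at close range give finer resolution where avoidance decisions matter most.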