| Vision is an important channel through which people perceive the world, and as the world's population grows, so does the number of people who are blind or visually impaired. Loss of vision not only affects them psychologically but also causes great inconvenience in daily life, especially when traveling outdoors. With advances in technology and social development, the needs of blind people are receiving attention from many quarters, particularly in outdoor navigation and obstacle avoidance. Many assistive devices and obstacle avoidance systems are currently under development. Most of them detect the distance to obstacles using ultrasonic sensors, infrared sensors, or LiDAR, and convey that distance to the user through vibration or beeping. In recent years, with the third wave of artificial intelligence, machine vision has become a major research focus with a wide range of applications. In assistive settings, machine vision is mainly used to identify the scene the blind person is in, to detect and recognize obstacles, to recognize faces, and to detect text and signs. Compared with traditional guidance methods, machine vision can make travel and daily life more convenient for blind people. In this paper, we propose a deep learning-based method for detecting and tracking obstacles encountered by blind pedestrians outdoors, which provides a new detection model and method for the development of guidance and navigation systems for the blind. The main work and results of this paper are as follows. First, we analyzed the spatial information needs of blind travelers and, through a literature review and field surveys, grouped the common obstacles around blind lanes into 15 categories. On this basis, we built an obstacle detection dataset and an instance segmentation dataset, divided them into training, test, and validation sets, and analyzed the key indicators of each dataset. Second, on the obstacle dataset, we trained obstacle detection models with several algorithms, including YOLO, SSD, and Faster R-CNN, and optimized them for this dataset. For YOLOv3, K-means clustering was used to generate anchor boxes suited to the characteristics of the obstacle dataset, which improved the model's average detection accuracy. For SSD, to improve the accuracy of multi-scale target detection, the algorithm was modified by adding Conv3_3 layers and changing the generation ratios of the prior boxes; the improved algorithm shows a clear gain in multi-scale detection accuracy. Third, for detecting and recognizing blind lanes and crosswalks, a dataset was built for Mask R-CNN instance segmentation, and segmentation models were trained with FPN+ResNet101 and FPN+ResNet50 backbones; these models detect blind lanes and crosswalks in images with high accuracy. Finally, the trained YOLOv3-Kmeans and YOLOv4 obstacle detection models were parsed with the deep learning module in OpenCV to build a target tracking model. Both models were used to detect and track targets in video streams, and the experiments show that they track targets well. An analysis of tracking on consecutive video frames shows that the YOLOv3-Kmeans-based model tracks better, with fewer missed detections. |
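
The K-means anchor generation mentioned for YOLOv3 can be illustrated with a minimal sketch. The abstract does not state the distance measure or the number of clusters, so this example assumes the `1 - IoU` distance commonly used for YOLO anchors (implemented here as assignment to the anchor with the highest IoU) and `k=9`, the number of anchors in the standard YOLOv3 configuration. The function names `iou_wh` and `kmeans_anchors` are hypothetical and are not taken from the paper.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) boxes aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors.

    Assumption: each box joins the anchor with the highest IoU,
    which is equivalent to minimizing the distance 1 - IoU.
    """
    rng = random.Random(seed)
    boxes = [(float(w), float(h)) for w, h in boxes]
    anchors = rng.sample(boxes, k)  # initialize from k distinct samples
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda j: iou_wh(box, anchors[j]))
            clusters[best].append(box)
        updated = []
        for j, members in enumerate(clusters):
            if members:
                # New anchor is the mean width and height of its members.
                mean_w = sum(m[0] for m in members) / len(members)
                mean_h = sum(m[1] for m in members) / len(members)
                updated.append((mean_w, mean_h))
            else:
                updated.append(anchors[j])  # keep an empty cluster's anchor
        if updated == anchors:  # assignments have stabilized
            break
        anchors = updated
    # Sort by area, small to large, as anchors are usually listed.
    return sorted(anchors, key=lambda a: a[0] * a[1])


if __name__ == "__main__":
    # Synthetic box sizes in pixels, standing in for the labeled dataset.
    demo_rng = random.Random(1)
    demo_boxes = [(demo_rng.uniform(10, 300), demo_rng.uniform(10, 300))
                  for _ in range(200)]
    for w, h in kmeans_anchors(demo_boxes):
        print(round(w), round(h))
```

In practice the input would be the widths and heights of the labeled boxes in the obstacle dataset, scaled to the network input size, and the resulting anchors would replace the default ones in the detector's configuration.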