| Object detection and semantic segmentation are both classic problems in computer vision. Object detection must accurately locate each object of interest in an image and determine its category, which makes it an object-level task, while semantic segmentation must classify the image pixel by pixel and predict the category of every pixel, which makes it a pixel-level task. Both extract the position and category of objects from images, and each suits classification work with different precision requirements. When object detection is performed on complex images, object occlusion and information redundancy lead to inaccurately positioned bounding boxes and false detections. To address these problems, this thesis proposed the YOLOv5-SPConv model, which achieves accurate multi-object detection. This object detection method is best suited to cases where labels are complete and object positions are marked with rectangular boxes, whereas semantic segmentation provides a finer-grained, pixel-by-pixel classification of objects. To address the lack of complete labels in semantic segmentation and the low segmentation accuracy for small regions and edges, this thesis proposed AdaptSegNet-SR, a semantic segmentation method based on domain adaptation, which significantly improved the mIoU of semantic segmentation and greatly reduced the time cost of label production. Furthermore, this thesis transferred the above object detection and semantic segmentation methods to online teaching and built a learning content detection system for Metasequoia online teaching videos, which provided a data basis for the construction of online teaching video resource libraries. The specific work of this thesis is as follows: (1) Object detection method based on YOLOv5: Given the low accuracy of object detection and
mutual occlusion of objects, and to reduce the influence of feature redundancy and object occlusion on detection while improving detection accuracy, this thesis proposed an object detection algorithm based on YOLOv5. This thesis introduced SPConv and Soft-NMS into the model and verified its superiority on a commodity outbound data set with complex spatial relationships among objects, raising the Precision, Recall and mAP of object detection to 98.320%, 96.380% and 98.850% respectively and reducing false detections. (2) Semantic segmentation method based on domain adaptation: To address the lack of complete labels for images used in semantic segmentation and the low segmentation accuracy for edges and small regions, this thesis proposed a finer-grained image semantic extraction method, AdaptSegNet-SR, a semantic segmentation method based on domain adaptation. To make the model focus on the more important channels and to mitigate the loss of edge information that domain adaptation methods may suffer, this thesis introduced a self-attention mechanism and remote sensing spectral indices into the segmentation network. Finally, this thesis verified the effectiveness of the model on remote sensing data sets with irregular object shapes and large differences in distribution between data sets, and improved the mIoU of image segmentation to 78.11% without using all of the target-domain labels. This was a significant improvement over commonly used semantic segmentation methods such as AdaptSegNet and DeepLabv3+ without domain adaptation. (3) Online teaching video detection system based on object detection and semantic segmentation: To better integrate online learning resources and provide a data basis for the teaching process, this thesis applied the proposed object detection and semantic segmentation methods to online teaching and studied the object detection and semantic segmentation
of multi-modal data. Finally, this thesis took the Metasequoia online platform as an example and implemented the online teaching video detection system. Through detection and segmentation of the teaching video content, the key and difficult points in the teaching content were further analyzed, providing a data basis for the construction of video resource libraries. In summary, this thesis studied object detection and semantic segmentation methods in computer vision separately, then combined the two and applied them to the detection and segmentation of online teaching video content. By analyzing the key and difficult points in the teaching videos from the detection and segmentation results, it provided an important basis for teachers and students to understand the composition of video content and the distribution of key and difficult points. |
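
The abstract names Soft-NMS as one of the components added to YOLOv5 to cope with mutually occluding objects, but it does not say which variant or which parameters were used. As a rough illustration of the general idea only, the following is a minimal sketch of Gaussian Soft-NMS in plain Python. The Gaussian penalty, the `sigma` and `score_threshold` values, the `(x1, y1, x2, y2)` box format, and the function names are all assumptions made for this example, not details taken from the thesis.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS (illustrative sketch; parameters are assumed values).

    Instead of deleting boxes that overlap the current best box, their
    scores are decayed by exp(-iou^2 / sigma). Returns a list of
    (original_index, rescored_confidence) in selection order.
    """
    candidates = [[i, float(s)] for i, s in enumerate(scores)]
    kept = []
    while candidates:
        # Select the remaining candidate with the highest (possibly decayed) score.
        best = max(range(len(candidates)), key=lambda k: candidates[k][1])
        idx, score = candidates.pop(best)
        if score < score_threshold:
            break  # every remaining candidate scores even lower
        kept.append((idx, score))
        # Decay the scores of the remaining boxes according to their overlap.
        for cand in candidates:
            overlap = iou(boxes[idx], boxes[cand[0]])
            cand[1] *= math.exp(-(overlap ** 2) / sigma)
    return kept


# Two heavily overlapping boxes and one isolated box (made-up values).
example_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
example_scores = [0.9, 0.8, 0.7]
example_result = soft_nms(example_boxes, example_scores)
```

In this toy input the second box overlaps the first heavily, so its score is reduced rather than discarded outright. A partially occluded object whose box overlaps a neighbour therefore still has a chance of surviving, which is the property that makes this family of methods attractive for crowded scenes.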