Semantic image segmentation plays an important role in scene perception and navigation for autonomous driving. In recent years, deep-learning-based methods have steadily improved on this challenging task. However, most existing methods require large amounts of pixel-level annotation, and well-trained models generally fail to segment objects of novel categories. To address these problems, this thesis proposes novel semantic image segmentation methods based on fully-supervised learning, few-shot learning, and zero-shot learning. With autonomous driving scene perception as the target application, we further investigate the use of the proposed methods for recognizing the intrusion behavior of unknown objects. The main contributions of this thesis are:

1. To handle objects of widely varying scale in semantic image segmentation with a large number of labeled samples, this thesis proposes an attention-based multi-scale feature aggregation method. The method employs an attention mechanism to adjust the weights of different channels in intermediate features and adaptively selects features from different scales for fusion. Experimental results on PASCAL-VOC and on the autonomous driving scene perception dataset Cityscapes show that our method surpasses the existing single-scale model by 5.6% mIoU and 7.6% mIoU, respectively.

2. To address the incomplete and inconsistent segmentation that results from having only a few labeled training samples, this thesis proposes an attention-aided, LSTM-based method for few-shot semantic segmentation. The main idea is to estimate the regions of the target categories with an attention mechanism and to introduce an LSTM module that iteratively updates and refines the segmentation results for those categories. In one-shot semantic segmentation experiments, the proposed method exceeds the baseline model by 5.3% mIoU on the PASCAL-VOC dataset and achieves the best accuracy, 83.0% mIoU, on the FSS-1000 dataset. In addition, the proposed method is tested on the Cityscapes dataset and achieves a competitive 21.0% mIoU.

3. For zero-shot semantic segmentation, where novel categories have no labeled training samples at all, this thesis proposes a multi-modal semantic segmentation method based on regional disentanglement. The method transfers semantic information between categories by exploiting similarities among word embedding vectors. In addition, a new regional disentanglement strategy is designed to significantly alleviate the prediction bias towards base categories. Averaged over three datasets of different scales, the proposed method achieves the best accuracy, surpassing the previous state-of-the-art method by 8.47% hIoU and 8.35% mIoU. On the autonomous driving dataset Cityscapes, the proposed method outperforms the baseline by 12.71% hIoU and 7.83% mIoU, respectively.

4. Finally, for the task of recognizing the intrusion behavior of unknown objects in autonomous driving, this thesis investigates the application of our fully-supervised, few-shot, and zero-shot semantic segmentation methods. First, a lane intrusion behavior recognition dataset is collected. We then propose a cascade model that extracts the features of unknown objects using the three semantic segmentation methods and predicts intrusion behaviors from the extracted features. All of our methods achieve satisfactory performance on the intrusion behavior recognition task.
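
All results above are reported in mIoU and hIoU, which the abstract does not define. The following is a minimal sketch of how these two metrics are commonly computed, not the evaluation code used in the thesis. It assumes that mIoU is the mean of per-class intersection-over-union values and that hIoU follows the usual zero-shot segmentation convention of taking the harmonic mean of the mIoU over base (seen) categories and the mIoU over novel (unseen) categories. All function names and the toy labels are hypothetical and chosen only for illustration.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_iou(cm):
    """IoU per class = TP / (TP + FP + FN); None if the class never occurs."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                         # true c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp    # predicted c, true other
        denom = tp + fp + fn
        ious.append(tp / denom if denom else None)
    return ious


def mean_iou(ious, classes=None):
    """Average IoU over the given class indices (all classes by default)."""
    if classes is None:
        classes = range(len(ious))
    vals = [ious[c] for c in classes if ious[c] is not None]
    return sum(vals) / len(vals) if vals else 0.0


def harmonic_iou(miou_seen, miou_unseen):
    """Assumed hIoU definition: harmonic mean of seen and unseen mIoU."""
    total = miou_seen + miou_unseen
    return 2 * miou_seen * miou_unseen / total if total else 0.0


# Toy example: six flattened pixels, three classes.
# Classes 0 and 1 are treated as base (seen), class 2 as novel (unseen).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

ious = per_class_iou(confusion_matrix(y_true, y_pred, num_classes=3))
miou_all = mean_iou(ious)
miou_seen = mean_iou(ious, classes=[0, 1])
miou_unseen = mean_iou(ious, classes=[2])
hiou = harmonic_iou(miou_seen, miou_unseen)
print(ious, miou_all, miou_seen, miou_unseen, hiou)
```

Because the harmonic mean is dominated by the smaller of its two inputs, a model that is strongly biased towards base categories scores a low hIoU even when its overall mIoU looks reasonable, which is why zero-shot results are usually reported with both metrics.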