| The leaf vein pattern is one of the unique physiological characteristics of plants. Veins at different levels play various roles in plant growth. Thus, leaf vein segmentation is crucial for plant species identification, quantitative phenotypic description and genetics research. Current research focuses on manual feature extraction algorithms for leaf vein segmentation, which offer a low level of automation and only moderate generalizability. In addition, most studies aim to segment the vein pattern from leaf images while ignoring the distribution of veins at different levels. To address the leaf vein segmentation task, we first propose a Transformer variant termed LV-Segformer for vein pattern extraction. LV-Segformer consists of three modules: an efficient backbone, a lightweight multi-stage alignment (MA) module and an accurate hybrid decoder. The Transformer backbone generates multiscale outputs with fewer parameters, and the lightweight MA module aligns and fuses these features using depthwise separable convolution. The hybrid decoder refines feature recovery by embedding global semantics and local details simultaneously. In addition, skip connections are introduced between the encoder and decoder, and a strip pooling module suppresses background noise. To make up for the lack of public leaf vein segmentation datasets, we construct two benchmark datasets to evaluate the performance of the model. The results show that LV-Segformer achieves state-of-the-art (SOTA) performance with fewer parameters. For the fine-grained leaf vein segmentation task, we design a self-knowledge distillation (SKD) module based on the LV-Segformer model. Specifically, the LV-Segformer model serves as the self-teacher network, while the student network consists of the encoder and the knowledge distillation module. The teacher network improves its performance by comparing its predictions with the ground-truth images, while the student network continuously refines the backbone by learning the deep features generated by the teacher network. The SKD module introduces four auxiliary branches and a Kullback-Leibler (K-L) loss function in the training phase and discards all auxiliary branches at inference. To validate the performance, two three-level leaf vein segmentation datasets are constructed. Compared with other CNN and Transformer models, our proposed fine-grained segmentation model achieves SOTA performance and better generalizability. |
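
The K-L loss used by the SKD module can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `temperature` parameter, the per-pixel averaging and the four-class layout (background plus three vein levels) are all assumptions made for illustration. It only shows the general idea of penalising a student branch whose per-pixel class distribution diverges from the teacher's.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over one pixel's class logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Mean per-pixel KL(teacher || student).

    Each argument is a list of pixels; each pixel is a list of class logits.
    The temperature is a hypothetical knob, not taken from the abstract.
    """
    if not teacher_logits or len(teacher_logits) != len(student_logits):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for t_pixel, s_pixel in zip(teacher_logits, student_logits):
        p = softmax(t_pixel, temperature)  # teacher distribution (target)
        q = softmax(s_pixel, temperature)  # student distribution
        total += kl_divergence(p, q)
    return total / len(teacher_logits)


# Two pixels, four assumed classes: background and three vein levels.
teacher = [[2.0, 0.5, -1.0, 0.0], [0.1, 0.2, 0.3, 0.4]]
student = [[1.5, 0.7, -0.5, 0.1], [0.4, 0.3, 0.2, 0.1]]
loss = distillation_loss(teacher, student)
```

In a training loop of this kind, one such term would typically be computed for each auxiliary branch and added to the segmentation loss; since the branches exist only to supply these terms, dropping them at inference leaves the backbone's forward pass unchanged.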