| Weakly supervised fine-grained image recognition (WFGIR) aims to distinguish hundreds of subcategories within each basic-level category using only image-level labels. It is an extremely challenging task, and existing approaches mainly focus on localizing discriminative semantic parts or patches, because the key differences among subcategories are subtle and local. However, they localize these regions independently and directly from high-level feature maps, neglecting the fact that regions are mutually correlated and that region groups can be more discriminative. In addition, we find that, because a Convolutional Neural Network (CNN) stacks local receptive fields, discriminative regions diffuse in its high-level feature maps, which leads to inaccurate discriminative region localization. We divide the problem into two sub-issues: 1) region grouping learning, which fully mines and exploits the discriminative potential of correlations through correlation-guided discriminative learning and graph-propagation correlation learning to accurately and implicitly localize the discriminative region group; 2) low-rank mechanism learning, which addresses the problem of discriminative region diffusion and finds better fine-grained details by learning a set of discriminative low-rank bases to reconstruct the low-rank feature maps. For region grouping learning, we propose an end-to-end Correlation-guided Discriminative Learning (CDL) model that exploits the discriminative region group through correlation-guided learning and strengthens the discriminative elements while suppressing the useless ones in the discriminative feature vectors. We also propose an end-to-end Graph-propagation based Correlation Learning (GCL) model that strengthens each single region by considering its context information and simultaneously mines the region grouping through graph-propagation correlation learning, and then performs discriminative interaction between the selected discriminative patches through a Graph Convolutional Network (GCN). For low-rank mechanism learning, we propose an end-to-end Discriminative Feature-oriented Gaussian Mixture Model (DF-GMM) to address the problem of discriminative region diffusion and find better fine-grained details. We argue that this diffusion arises from stacking local receptive fields and leads to inaccurate discriminative region localization. The proposed method iteratively learns a set of discriminative bases through a Gaussian Mixture Model (GMM) to accurately select discriminative details and filter out more irrelevant information in high-level semantic feature maps, and then restores the original spatial information of the low-rank discriminative bases to reconstruct the low-rank feature maps. Extensive experiments demonstrate the effectiveness of the proposed networks and show that the models achieve better performance in both accuracy and efficiency on the CUB Bird, Stanford Cars, and FGVC Aircraft datasets. |
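
The low-rank mechanism summarized in the abstract (iteratively learning a small set of bases with a Gaussian mixture, then reconstructing the feature map from them) can be illustrated with a minimal NumPy sketch. This is not the DF-GMM model itself. It assumes isotropic, equally weighted Gaussian components, bases initialized from randomly chosen feature vectors, and a feature map already flattened to shape `(H*W, C)`. The function name `low_rank_reconstruct` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def low_rank_reconstruct(features, num_bases=4, num_iters=5, temperature=1.0, seed=0):
    """Illustrative EM-style low-rank reconstruction (not the paper's DF-GMM).

    features: (N, C) array, N = H*W spatial positions, C channels.
    Returns (reconstructed, bases, responsibilities).
    """
    rng = np.random.default_rng(seed)
    n, _ = features.shape
    # Assumption: initialise the bases from randomly chosen feature vectors.
    bases = features[rng.choice(n, size=num_bases, replace=False)].copy()

    for _ in range(num_iters):
        # E-step: soft-assign every position to every basis with a softmax
        # over negative squared distances (isotropic, equal-weight Gaussians).
        sq_dist = ((features[:, None, :] - bases[None, :, :]) ** 2).sum(axis=-1)
        logits = -sq_dist / (2.0 * temperature)
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(logits)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: each basis becomes the responsibility-weighted mean feature.
        weights = resp.sum(axis=0) + 1e-8
        bases = (resp.T @ features) / weights[:, None]

    # Reconstruction: mix the final bases with the responsibilities from the
    # last E-step. The result has rank at most num_bases.
    reconstructed = resp @ bases
    return reconstructed, bases, resp
```

For a feature map of shape `(H, W, C)`, one would call `low_rank_reconstruct(fmap.reshape(-1, C))` and reshape the first return value back to `(H, W, C)`. Because every reconstructed position is a convex combination of only `num_bases` vectors, variation that the bases do not capture is discarded, which is the filtering effect the abstract attributes to the low-rank bases.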