| Fine-grained visual categorization (FGVC) is an active area of computer vision research. Because annotation requires expert knowledge, labeling a large-scale fine-grained image classification dataset is far more costly and difficult than labeling a general one. To open up new data sources for FGVC research, the research community has begun to turn its attention to data collected from the web. Building a dataset from the web is easy, but such datasets contain far more mislabeled items than manually annotated ones. Training FGVC models on web data must therefore tackle the label noise problem. We refer to semantic inconsistency between an image and its label as noise: “out-of-distribution noise” refers to irrelevant samples, and “in-distribution noise” refers to mislabeled samples. The proportion of noise in a web dataset can reach a quarter of all samples, far too much to ignore. Since fine-grained image classification depends more heavily on representation learning than coarse-grained tasks do, excessive noise in the dataset will inevitably lead to training failure. Therefore, to train FGVC models on web data, it is necessary to study denoising training methods for FGVC. The conventional denoising strategy selects clean samples by loss value: clean samples are kept and noisy ones are eliminated. This strategy is simple yet effective; however, it cannot handle in-distribution and out-of-distribution noise separately, and it arbitrarily discards hard examples as noise. Excessive removal avoids noise but weakens the model through insufficient training. This dissertation studies how to train an FGVC model effectively on noisy web datasets from the perspective of denoising. The specific contributions are as follows: (1) The dissertation clarifies the definition, sources, and characteristics of noise in web datasets, summarizes the development and strategies of learning with noisy data, analyzes the challenges of training FGVC models from 
noisy data, and details the collection of the webly supervised FGVC datasets (Web-Aircraft, Web-Bird, and Web-Cars) used in the subsequent chapters. (2) To tackle out-of-distribution noise, we propose a denoising training algorithm based on soft-label cross-entropy tracking. The algorithm uses soft-label cross-entropy to track out-of-distribution samples and applies a global sampling method to drop the noisy samples. Finally, feature normalization and label smoothing are adopted to improve classification performance. Compared with the Co-teaching method, the proposed algorithm improves average classification accuracy (ACA) by 4% to 10%. (3) To address the problem of favoring easy samples while arbitrarily dropping samples as noise, we propose a denoising training algorithm based on label distribution learning. Commonly, low-loss samples are presumed to be clean and high-loss samples are filtered out as noise. The proposed label distribution learning method can estimate the label probability distribution of high-loss samples during training. The method also integrates a self-supervised learning module to train the FGVC model effectively. Compared with the Co-teaching baseline, the proposed method improves accuracy by 4.29%, 4.34%, and 4.22% on Web-Aircraft, Web-Bird, and Web-Cars, respectively. (4) To handle the simultaneous presence of out-of-distribution and in-distribution noise, we propose an FGVC denoising method based on prediction fidelity, which is calculated from the prediction history. The method simultaneously eliminates irrelevant noise and dynamically corrects in-distribution noise. Compared with the latest method, SELFIE, the proposed FGVC denoising algorithm improves ACA by 2.16%, 1.2%, and 3.43% on Web-Aircraft, Web-Bird, and Web-Cars, respectively. With these new label denoising training methods, the performance gap between the webly supervised and the strongly supervised 
schemes has been narrowed to 8%. The research in this dissertation has been published in leading journals and conferences in the field, which supports the effectiveness of the proposed methods. |
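
The conventional loss-based selection that the abstract describes as the starting point can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the function name `select_small_loss`, the `keep_ratio` hyperparameter, and the toy loss values are all assumptions introduced here for illustration.

```python
from typing import List, Sequence, Tuple


def select_small_loss(
    losses: Sequence[float], keep_ratio: float
) -> Tuple[List[int], List[int]]:
    """Split sample indices into (kept, dropped) by per-sample loss.

    keep_ratio is a hypothetical hyperparameter: the fraction of the batch
    treated as clean. It is not a value taken from the dissertation.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    n_keep = max(1, int(round(keep_ratio * len(losses))))
    # Rank sample indices from lowest to highest loss.
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    kept = sorted(order[:n_keep])      # low-loss samples, assumed clean
    dropped = sorted(order[n_keep:])   # high-loss samples, assumed noisy
    return kept, dropped


# Toy per-sample losses (made-up numbers for illustration only).
batch_losses = [0.12, 2.75, 0.30, 1.90, 0.05, 0.45, 3.10, 0.22]
kept, dropped = select_small_loss(batch_losses, keep_ratio=0.75)
print("kept:", kept)
print("dropped:", dropped)
```

In this sketch the rule sees only the loss value, so it cannot tell whether a dropped sample is out-of-distribution noise, in-distribution noise, or simply a hard but correctly labeled example, which is the limitation the abstract raises.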