
Research On Deep Learning Methods When Learning With Noisy Labels

Posted on: 2024-11-12    Degree: Doctor    Type: Dissertation
Country: China    Candidate: L H Deng    Full Text: PDF
GTID: 1528307373970039    Subject: Computer Science and Technology
Abstract/Summary:
Deep Neural Networks (DNNs) have achieved impressive performance in various fields in recent years. However, the success of DNNs depends on large amounts of correctly labeled data, and in many real-world scenarios the datasets used to train DNNs contain noisy labels (incorrectly annotated labels), which degrade DNN performance. To tackle this issue, Learning with Noisy Labels (LNL) methods, which aim to improve DNN performance when training on noisily labeled datasets, have attracted extensive attention. Nonetheless, DNNs applied in practice may encounter complex scenarios in which existing LNL methods still have limitations. This dissertation therefore studies the design of noise-robust loss functions, analyzes the difference between noisy samples and hard clean samples, enhances the semantic features used by existing LNL methods, mitigates the drawbacks of existing LNL methods under instance-dependent label noise, and proposes corresponding solutions. Specifically, the main research of this dissertation includes:

1. In practical applications of DNNs, the noisy label problem often coexists with the class imbalance problem, yet few noise-robust loss functions can mitigate both problems simultaneously. This dissertation therefore proposes an LNL method that constructs a Noisy Label and Negative Sample (samples belonging to the majority class in class-imbalanced datasets) Robust Loss function (NLNSRL). Specifically, the negative impacts of the noisy label and class imbalance problems are quantified, and a loss function that minimizes these negative impacts is constructed through a linear programming algorithm. Using the Distantly Supervised Relation Extraction (DSRE) task and the named entity recognition task, both of which frequently suffer from these two problems, as examples, the dissertation demonstrates the effectiveness of the proposed method. Experimental results indicate that the method can effectively improve DNN generalization performance in practical scenarios. (A hedged sketch of a loss with both properties appears after item 2 below.)

2. Popular LNL methods often regard samples with high learning difficulty (high loss values) as noisy samples. However, irregular feature patterns in hard clean samples can also cause high learning difficulty, so hard clean samples may be misclassified as noisy. To mitigate this shortcoming, this dissertation proposes the Samples' Learning Risk-based Learning with Noisy Labels (SLRLNL) method, which uses a sample's learning risk, representing its influence on DNN accuracy, to separate hard clean samples from noisy samples more reliably than existing LNL methods. Moreover, to improve DNN learning on hard clean samples, the dissertation further proposes the Relabeling-based Label Augmentation (RLA) method, which encourages the DNN to occasionally learn different plausible labels for hard clean samples, thus enhancing the learning of these samples. Experimental results on commonly used noisily labeled image classification and relation extraction datasets demonstrate the effectiveness of SLRLNL. (A sketch of a learning-risk proxy and of RLA follows below.)
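
To make item 1 concrete, the following is a minimal illustrative sketch, not the dissertation's LP-derived NLNSRL construction: a bounded, noise-tolerant term is mixed with cross-entropy, and each sample is reweighted by inverse class frequency to counter class imbalance. The function name and the mixing parameter alpha are assumptions for illustration.

    import torch
    import torch.nn.functional as F

    def robust_reweighted_loss(logits, targets, class_counts, alpha=0.5):
        # Hypothetical sketch, NOT the dissertation's NLNSRL: mix a bounded
        # (noise-tolerant) term with cross-entropy, then reweight each sample
        # by inverse class frequency to counter class imbalance.
        probs = F.softmax(logits, dim=1)
        p_correct = probs.gather(1, targets.unsqueeze(1)).squeeze(1)
        bounded = 1.0 - p_correct  # in [0, 1]; bounded losses resist mislabeled outliers
        ce = F.cross_entropy(logits, targets, reduction="none")
        per_sample = alpha * bounded + (1.0 - alpha) * ce
        weights = class_counts.sum() / (len(class_counts) * class_counts.float())
        return (weights[targets] * per_sample).mean()

In training, such a loss would simply replace the usual cross-entropy; class_counts is the per-class sample count of the training set.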
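
Item 2 can be pictured with two small placeholder functions: learning_risk is a first-order influence proxy (alignment between a sample's gradient and the gradient of a small clean validation batch), and relabel_augment occasionally substitutes the model's most likely alternative label for a hard clean sample. Both are hedged readings of the SLRLNL/RLA ideas, not their exact formulations, and every name here is illustrative.

    import torch

    def learning_risk(model, loss_fn, x, y, x_val, y_val):
        # First-order proxy for a sample's influence on accuracy: a sample
        # whose gradient opposes the clean-validation gradient is "risky".
        # An illustrative approximation, not SLRLNL's exact definition.
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        g_sample = [p.grad.clone() for p in model.parameters() if p.grad is not None]
        model.zero_grad()
        loss_fn(model(x_val), y_val).backward()
        g_val = [p.grad for p in model.parameters() if p.grad is not None]
        return -sum((gs * gv).sum() for gs, gv in zip(g_sample, g_val)).item()

    def relabel_augment(logits, targets, hard_clean_mask, p=0.1):
        # RLA-style sketch: with probability p, train a hard clean sample on
        # the model's most likely alternative label instead of its given label.
        probs = torch.softmax(logits.detach(), dim=1)
        probs.scatter_(1, targets.unsqueeze(1), 0.0)  # mask out the given label
        alt = probs.argmax(dim=1)                     # best alternative label
        flip = (torch.rand_like(alt, dtype=torch.float32) < p) & hard_clean_mask
        return torch.where(flip, alt, targets)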

3. Existing LNL methods often rely on the semantic features extracted by the DNN to detect and mitigate label noise. However, these extracted features often have unstable correlations with the label across different data environments, which can compromise the efficacy of LNL methods. To mitigate this shortcoming, this dissertation proposes an Invariant Feature-based Label Correction (IFLC) method. IFLC enhances the learning of invariant features, i.e., features that maintain a stable correlation with the label, and improves the utilization of the captured features. Specifically, the dissertation first proposes the Label Disturbing (LD) method, which encourages the DNN to attain stable performance across different data environments and thereby guides it to learn invariant features. It then proposes the Representation Decorrelation (RD) method, which enhances the linear independence between different dimensions of the DNN's representation vector and thereby ensures accurate utilization of the learned features. Finally, robust linear regression is applied to the DNN's feature representation to correct labels. The effectiveness of IFLC was evaluated on four commonly used noisily labeled image classification datasets against state-of-the-art (SOTA) LNL methods; the results indicate that IFLC achieves performance comparable to or better than the SOTA methods. (A hedged sketch of an RD-style penalty appears after item 4 below.)

4. Existing LNL methods primarily rely on the DNN's memorization effect, which assumes that the DNN learns noisy samples more slowly than clean samples, to detect and mitigate the negative impact of noisy labels. However, this effect becomes unreliable under instance-dependent label noise, degrading the real-world performance of label correction methods. To mitigate this shortcoming, this dissertation proposes a Separable Feature Representation-based Label Correction (SFRLC) method that effectively detects and corrects instance-dependent label noise. SFRLC encourages the DNN to represent different semantic features in distinct feature dimensions by decorrelating the joint vector formed from the outputs of different DNN layers, thereby slowing the DNN's learning of noisy samples. The method is evaluated on three common noisily labeled image classification datasets, and the results show that it outperforms other LNL methods in handling instance-dependent and real-world label noise. (A sketch of the joint-layer decorrelation idea follows below.)
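
As an illustration of the representation decorrelation idea in item 3, the penalty below pushes the off-diagonal entries of the batch covariance of a representation toward zero, encouraging linearly independent feature dimensions. This is one plausible, assumed form of RD rather than the dissertation's exact method; in training it would typically be added to the task loss with a small coefficient.

    import torch

    def decorrelation_penalty(z):
        # z: (batch, dim) representation vector. Penalize off-diagonal
        # covariance so different dimensions stay linearly independent.
        # A generic, assumed form of the RD idea.
        z = z - z.mean(dim=0, keepdim=True)
        cov = (z.T @ z) / (z.size(0) - 1)
        off_diag = cov - torch.diag(torch.diag(cov))
        return (off_diag ** 2).sum() / z.size(1)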
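
The decorrelation step in item 4 can be sketched the same way, except that the vector being decorrelated is the concatenation of (pooled) outputs from several layers, so that distinct dimensions, and hence distinct layers, encode distinct semantics. The pooling choice and the function name below are assumptions, not SFRLC's exact procedure.

    import torch

    def joint_layer_decorrelation(layer_outputs):
        # layer_outputs: list of tensors, each (batch, dim) or (batch, c, h, w).
        # Concatenate pooled per-layer outputs into one joint vector per
        # sample and penalize its off-diagonal covariance.
        pooled = [f if f.dim() == 2 else f.mean(dim=(2, 3)) for f in layer_outputs]
        joint = torch.cat(pooled, dim=1)
        joint = joint - joint.mean(dim=0, keepdim=True)
        cov = (joint.T @ joint) / (joint.size(0) - 1)
        off_diag = cov - torch.diag(torch.diag(cov))
        return (off_diag ** 2).sum() / joint.size(1)
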
Keywords: Learning with noisy labels, Deep neural networks, Label correction, Noisy label robust loss function, Feature representation