Hashing And Sensitivity-based Undersampling For Imbalanced Classification And The Application In Large-scale Pathological Image Classification

Posted on: 2022-04-01  Degree: Master  Type: Thesis
Country: China  Candidate: L Qiu  Full Text: PDF
GTID: 2480306569981089  Subject: Computer technology
Abstract/Summary:
Early detection and diagnosis of cancer play an important role in the treatment and care of patients. Histopathological analysis is the gold standard for diagnosing precancerous lesions. However, because of variations in appearance, heterogeneity, and texture, manually evaluating large-scale histopathological images is time-consuming and laborious, and it often depends on subjective human interpretation. In recent years, with the development of slide scanning technology and the falling cost of digital storage, producing whole slide images (WSIs) from stained histopathological sections and designing computer-aided diagnosis systems have attracted extensive attention. In practice, tumor proliferation must be assessed at the level of the whole WSI. Diagnosing and locating tumor metastasis directly on a gigapixel WSI is not advisable because it consumes a large amount of memory. Existing techniques usually divide a WSI into small patches for further classification. However, since benign samples far outnumber malignant samples, class imbalance seriously degrades classification performance. Data resampling is the most common way to address class-imbalanced classification in medicine and in machine learning more broadly. However, existing resampling methods usually rely on distance-based neighborhood relationships to capture the data distribution. For large-scale, high-dimensional data sets, such methods are computationally inefficient. In noisy data sets, the minority class is insufficiently represented, and neighborhood-based resampling is easily affected by noise, which leads to an unreasonable resampling strategy. To address these shortcomings of existing undersampling methods, this thesis proposes an online weighted sampling method based on hashing and sensitivity that combines neural network classifier training with sampling. The main idea is to select samples that are more valuable to the current classifier by assigning higher weights to samples with higher classifier sensitivity. The method does not need to compute distances between samples, which makes it well suited to large-scale, high-dimensional data sets. In addition, it does not permanently discard samples, which avoids the information loss caused by discarding samples outright. Experimental results show that the proposed method achieves better classification performance and higher efficiency on large-scale, class-imbalanced histopathological image data.
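The main idea in the abstract, weighting majority-class samples by classifier sensitivity without computing pairwise distances, can be illustrated with a minimal sketch. The abstract does not give the thesis's actual definitions, so everything below is an assumption made for illustration: sensitivity is approximated as the mean absolute change in the classifier output under small random input perturbations, hashing is done with random-hyperplane sign bits, and the names `hash_buckets`, `sensitivity`, and `weighted_undersample` are hypothetical.

```python
import numpy as np

def hash_buckets(X, n_bits=4, seed=0):
    """Assign each row to a bucket with random-hyperplane sign hashing.

    Assumption: the thesis's hashing scheme is not specified in the abstract;
    this is a generic locality-sensitive hash that needs no pairwise distances.
    """
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((X.shape[1], n_bits))
    bits = (X @ planes > 0).astype(np.int64)
    return bits @ (1 << np.arange(n_bits))  # pack sign bits into an integer code

def sensitivity(predict_proba, X, eps=0.05, n_perturb=8, seed=0):
    """Mean absolute output change under small uniform perturbations.

    Assumption: a stand-in for whatever sensitivity measure the thesis uses.
    """
    rng = np.random.default_rng(seed)
    base = predict_proba(X)
    total = np.zeros(len(X))
    for _ in range(n_perturb):
        noise = rng.uniform(-eps, eps, size=X.shape)
        total += np.abs(predict_proba(X + noise) - base)
    return total / n_perturb

def weighted_undersample(X_major, n_select, predict_proba, seed=0):
    """Draw majority-class indices with probability tied to sensitivity.

    Weights are normalised inside each hash bucket so that every occupied
    bucket carries equal total probability mass. No sample gets zero weight,
    so nothing is permanently discarded across repeated draws.
    """
    rng = np.random.default_rng(seed)
    buckets = hash_buckets(X_major, seed=seed)
    sens = sensitivity(predict_proba, X_major, seed=seed) + 1e-12
    weights = np.zeros(len(X_major))
    for b in np.unique(buckets):
        mask = buckets == b
        weights[mask] = sens[mask] / sens[mask].sum()
    probs = weights / weights.sum()
    return rng.choice(len(X_major), size=n_select, replace=False, p=probs)
```

In an online setting, a draw like this would presumably be repeated as training proceeds, with `predict_proba` reflecting the current classifier so that the weights track which samples it is currently most sensitive to.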
Keywords/Search Tags:Cancer, histopathological image, class imbalanced classification, undersampling