| With the spread of Internet technology and the development of intelligent digital products, images have become easy to obtain and use, and new images need to be classified for users. In image classification tasks, traditional machine learning places high demands on the training set of image samples, so image classification consumes considerable manpower and resources. When only a small number of labeled image samples are available, classification accuracy drops sharply. To solve this problem, the self-taught learning algorithm introduces a large number of random images as auxiliary samples. Because random images can be downloaded from the network and need no labels, this saves considerable manpower and resources, and self-taught learning has become a research hotspot in machine-learning-based image classification. The self-taught learning algorithm consists of three steps: extracting basis vectors, reconstructing images, and training the classifier. The basic idea is to extract basis vectors from a large number of random images, reconstruct the labeled images and test images with these basis vectors to obtain their reconstruction coefficients, train a classifier on the reconstruction coefficients of the labeled images, and then classify the reconstruction coefficients of the test images with that classifier. This article improves the algorithm to address its shortcomings, applies it to image classification, and compares it experimentally with other algorithms. The research work mainly covers the following aspects. Firstly, classification accuracy is affected by the randomness of feature point selection. To solve this problem, a self-taught learning classification algorithm based on the image object feature space is proposed. First, a local multi-channel active contour model based on color and texture features is used to find the image object region. Then, features are selected within the object region and sparsely coded to establish the feature space. 
Experiments indicate that the proposed algorithm avoids the impact of the randomness of feature point selection and effectively improves the accuracy of image classification. Secondly, classification accuracy is affected by the randomness of the unlabeled image data and by the shortage of labeled image data. To solve this problem, a transformed-supervision self-taught learning algorithm for image classification is proposed. First, it extracts local features of the image object, retrieves relevant images from the image library, and learns representations from the random image data under the supervision of the labeled images and the relevant images. The labeled images are reconstructed with the basis vectors to obtain their reconstruction coefficients, and a classifier is trained on these coefficients. The unlabeled images are then reconstructed with the basis vectors to obtain their reconstruction coefficients, and the TSVM algorithm uses the unlabeled images to optimize the classifier and obtain better classification results. Experiments indicate that the proposed algorithm effectively improves the accuracy of image classification. Lastly, in the self-taught learning algorithm the classification model is not updated as test images are classified. This article therefore proposes a self-taught learning algorithm based on incremental HDP, which combines the incremental HDP model with the self-taught learning algorithm. First, the incremental HDP model trains a classifier on the labeled image samples. Then, during test image classification, the classification result is output and also fed back, and a risk identification mechanism determines whether the image is added to the training set to update the classifier. The algorithm further alleviates the adverse effects of a small sample size on image classification. 
Experimental results show that the classification accuracy of the proposed algorithm gradually improves as image classification tasks proceed, whereas the classification accuracy of the original self-taught learning algorithm remains unchanged. |
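
The three steps of the baseline self-taught learning algorithm described in the abstract (extract basis vectors from unlabeled data, reconstruct the labeled and test samples, train and apply a classifier on the reconstruction coefficients) can be sketched roughly as follows. This is only an illustrative sketch under several assumptions that do not come from the source: it uses NumPy, plain feature vectors in place of image features, ISTA for the sparse reconstruction, an alternating least-squares basis update, and a nearest-centroid classifier as a stand-in for whatever classifier the work actually uses. All function names (`learn_basis`, `sparse_codes`, `train_centroids`, `predict`) are hypothetical.

```python
import numpy as np


def soft_threshold(x, t):
    """Element-wise shrinkage operator used by ISTA."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def sparse_codes(X, B, lam=0.1, n_iter=100):
    """Step 2: reconstruction coefficients A so that X is close to A @ B.

    Minimises 0.5 * ||X - A B||^2 + lam * ||A||_1 with ISTA (an assumed
    solver, not necessarily the one used in the source).
    """
    L = np.linalg.norm(B @ B.T, 2) + 1e-12  # Lipschitz constant of the gradient
    A = np.zeros((X.shape[0], B.shape[0]))
    for _ in range(n_iter):
        grad = (A @ B - X) @ B.T
        A = soft_threshold(A - grad / L, lam / L)
    return A


def learn_basis(U, k, lam=0.1, n_outer=20, seed=0):
    """Step 1: learn k basis vectors (rows of B) from unlabeled data U."""
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((k, U.shape[1]))
    B /= np.linalg.norm(B, axis=1, keepdims=True)
    for _ in range(n_outer):
        A = sparse_codes(U, B, lam, n_iter=50)
        # Least-squares basis update with the codes fixed, then renormalise rows.
        B = np.linalg.lstsq(A, U, rcond=None)[0]
        B /= np.maximum(np.linalg.norm(B, axis=1, keepdims=True), 1e-12)
    return B


def train_centroids(A, y):
    """Step 3a: nearest-centroid classifier in coefficient space (a stand-in)."""
    classes = np.unique(y)
    return classes, np.stack([A[y == c].mean(axis=0) for c in classes])


def predict(A, classes, centroids):
    """Step 3b: assign each coefficient vector to the closest class centroid."""
    d = ((A[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d.argmin(axis=1)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    atoms = np.eye(8)[:4]  # synthetic hidden structure shared by all data
    # Many unlabeled samples, standing in for the random images.
    U = rng.random((200, 4)) @ atoms + 0.05 * rng.standard_normal((200, 8))
    B = learn_basis(U, k=4)

    y_train = np.array([0, 0, 0, 1, 1, 1])  # only a few labeled samples
    X_train = 2.0 * atoms[y_train] + 0.05 * rng.standard_normal((6, 8))
    y_test = rng.integers(0, 2, size=50)
    X_test = 2.0 * atoms[y_test] + 0.05 * rng.standard_normal((50, 8))

    classes, centroids = train_centroids(sparse_codes(X_train, B), y_train)
    pred = predict(sparse_codes(X_test, B), classes, centroids)
    print("test accuracy:", (pred == y_test).mean())
```

The point of the sketch is the data flow: the basis is learned only from unlabeled samples, and the classifier only ever sees reconstruction coefficients, which is why a handful of labeled samples can suffice. The improvements proposed in the abstract (object-region feature selection, TSVM refinement, incremental HDP updates) are not modelled here.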