| The Support Vector Machine (SVM) is one of the most effective supervised learners. To generalize well, however, an SVM needs a sufficient number of labeled training samples before training. In many practical problems, such as spam filtering and medical image detection, it is difficult to obtain enough labeled samples. Active learning is a method that makes full use of unlabeled samples: the "most valuable" samples are iteratively labeled by experts and then added to the training set. Combining active learning with SVM allows many practical problems to be solved efficiently. This thesis proposes two SVM active learning approaches for data of different dimensionality, using SVM as the base learner. The major research contents are summarized as follows: (1) For low-dimensional data, a distance-based SVM active learning strategy is proposed, called Dix_SVMactive. This algorithm measures the value of a sample by two distances: the distance between the unlabeled sample and the current hyperplane, and the distance between the unlabeled sample and the labeled samples. The most valuable unlabeled samples are then labeled by experts. (2) For high-dimensional data, an SVM active learning strategy based on vector cosine is presented, named Cos_SVMactive. This algorithm measures the value of a high-dimensional sample by calculating the cosine between the unlabeled sample and the labeled samples. The most valuable unlabeled samples are then labeled by experts. (3) Both Dix_SVMactive and Cos_SVMactive use a clustering algorithm to granulate a given set of unlabeled samples. The initial training set consists of the samples with the highest relation degree to the center of each class. The SVM is then trained on this set to obtain the initial classifier.
The most valuable samples are selected for manual labeling based on the sample confidence levels defined by the two approaches, and the balance of the training set is adjusted after each iteration to obtain better generalization performance. In addition, the Cos_SVMactive algorithm provides a new stopping condition for the iteration. (4) A series of experiments with the two algorithms is carried out on standard UCI data sets. The experimental results show that, compared with the traditional SVMactive (Tong SVMactive) and an SVMactive approach based on random sampling, the two proposed algorithms improve classification accuracy. The Cos_SVMactive approach has both good generalization performance and good convergence. Additionally, for some data sets, Cos_SVMactive maintains higher precision while significantly reducing training time. Active learning has become an active research topic, and combining SVM with active learning can help solve more practical problems. The results of this work not only enrich the theory and application research of SVM, but also expand the application range of SVM and active learning. The work therefore has both theoretical and practical value. |
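
The two sample-selection ideas summarized above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract gives neither the exact scoring formulas nor the way the two distances are combined, so the function names (`select_by_distance`, `select_by_cosine`), the linear hyperplane parameters `w` and `b`, and both selection rules are assumptions chosen only for illustration. The distance-based rule here prefers the sample nearest the hyperplane and breaks ties by taking the one farthest from any labeled sample. The cosine-based rule prefers the sample whose largest cosine with any labeled sample is smallest.

```python
import math


def cosine(u, v):
    """Cosine of the angle between vectors u and v (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def euclidean(u, v):
    """Euclidean distance between vectors u and v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def hyperplane_distance(x, w, b):
    """Distance from x to the linear hyperplane w.x + b = 0."""
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / math.sqrt(
        sum(wi * wi for wi in w)
    )


def select_by_distance(unlabeled, labeled, w, b):
    """Return the index of the unlabeled sample to query next.

    Assumed rule (not taken from the thesis): smallest distance to the
    hyperplane first; among ties, largest distance to the nearest
    labeled sample.
    """
    def key(i):
        x = unlabeled[i]
        nearest_labeled = min(euclidean(x, y) for y in labeled)
        return (hyperplane_distance(x, w, b), -nearest_labeled)

    return min(range(len(unlabeled)), key=key)


def select_by_cosine(unlabeled, labeled):
    """Return the index of the unlabeled sample to query next.

    Assumed rule (not taken from the thesis): the sample whose maximum
    cosine with any labeled sample is smallest, i.e. the one least
    aligned with what is already labeled.
    """
    def key(i):
        return max(cosine(unlabeled[i], y) for y in labeled)

    return min(range(len(unlabeled)), key=key)


if __name__ == "__main__":
    # Toy data, invented for this example only.
    unlabeled = [[2.0, 0.0], [0.1, 1.0], [-0.1, -3.0]]
    labeled = [[1.0, 0.0], [-1.0, 0.0]]
    print(select_by_distance(unlabeled, labeled, w=[1.0, 0.0], b=0.0))
    print(select_by_cosine(unlabeled, labeled))
```

In an active learning loop, the returned index would identify the sample handed to the expert for labeling, after which the SVM is retrained. With a kernel SVM, the hyperplane distance would come from the trained model's decision value rather than from explicit `w` and `b`.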