| Character recognition is one of the most important fields in pattern recognition. The main approach to the problem is statistical: researchers collect a large number of samples, label them, and train a classifier on them. This labeling work is time-consuming and tedious. We therefore propose an OCR method based on both active learning and the support vector machine (SVM). The system differs from other OCR systems as follows: it does not require every training sample to be labeled in advance; it can ask the user about any sample it is unsure of and update the classifier with the labels given, which reduces the tedious labeling work; furthermore, it can keep improving itself by querying the user. This thesis introduces the theory we rely on. First, we review the theory of optical character recognition, whose key steps are preprocessing and feature extraction. Second, we propose a combined method for the OCR system that integrates active learning with the support vector machine. Third, we use a set of data to verify the correctness of the method. Finally, we apply this idea to the design of an OCR software system. |
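
The query-and-retrain loop described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a binary problem, a linear SVM trained by plain subgradient descent, and a query rule that picks the sample closest to the decision boundary (a common uncertainty-sampling heuristic). The function names, the toy 2-D data standing in for extracted character features, and the `ask_label` callback that plays the role of the user are all hypothetical.

```python
import random

def decision_value(w, b, x):
    """Signed score of x under the linear SVM; the sign is the predicted class."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=50):
    """Fit a linear SVM by subgradient descent on the L2-regularized hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * decision_value(w, b, xi) < 1  # sample inside the margin?
            w = [(1 - eta * lam) * wj for wj in w]         # shrink step from the L2 penalty
            if violated:
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b

def most_uncertain(w, b, pool, candidates):
    """Index of the candidate sample closest to the decision boundary."""
    return min(candidates, key=lambda i: abs(decision_value(w, b, pool[i])))

def fit_on_labeled(pool, labels):
    """Train on the samples that currently have a label."""
    idx = sorted(labels)
    return train_linear_svm([pool[i] for i in idx], [labels[i] for i in idx])

def active_learn(pool, ask_label, seed_indices, budget):
    """Request up to `budget` labels, always asking about the least certain sample."""
    labels = {i: ask_label(i) for i in seed_indices}
    for _ in range(budget):
        candidates = [i for i in range(len(pool)) if i not in labels]
        if not candidates:
            break
        w, b = fit_on_labeled(pool, labels)
        query = most_uncertain(w, b, pool, candidates)
        labels[query] = ask_label(query)  # the "user" answers the query
    w, b = fit_on_labeled(pool, labels)
    return w, b, labels

# Toy stand-in for character features: two well-separated classes in 2-D (assumed data).
rng = random.Random(0)
pool, truth = [], []
for _ in range(30):
    pool.append([3 + rng.uniform(-1, 1), 3 + rng.uniform(-1, 1)])
    truth.append(1)
    pool.append([-3 + rng.uniform(-1, 1), -3 + rng.uniform(-1, 1)])
    truth.append(-1)

w, b, labels = active_learn(pool, lambda i: truth[i], seed_indices=[0, 1], budget=10)
correct = sum((decision_value(w, b, x) > 0) == (t > 0) for x, t in zip(pool, truth))
accuracy = correct / len(pool)
print(f"labels requested: {len(labels)} of {len(pool)}, accuracy: {accuracy:.2f}")
```

In this sketch only the seed samples and the queried samples ever receive a label; the rest of the pool stays unlabeled. A real OCR system would need a multi-class scheme (for example one-vs-rest) and features produced by the preprocessing and feature-extraction steps, neither of which is shown here.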