
The Research Of An Automatic Recommended Model Of Reviewer For A Submission System

Posted on: 2010-02-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Liu
GTID: 2178360275474417
Subject: Control theory and control engineering
Abstract/Summary:
Automatic text categorization (ATC) is the task of automatically assigning documents to categories drawn from a predefined set. As a key technology for organizing and processing large volumes of document data, text classification can largely resolve the problem of information disorder and helps users find the information they need quickly. Moreover, text classification has broad application prospects as the technical basis of information filtering, information retrieval, and search engines.

The submission systems of academic journals and conferences must distribute thousands of papers among hundreds of reviewers for examination and comment. Doing this within a short time works poorly, because the matching precision is low. In particular, when a reviewer's area of expertise is unknown, information about that expertise cannot be collected promptly and accurately, which hinders the normal sorting of papers into categories. Choosing the proper reviewer for each paper is the key to evaluating paper quality correctly and to raising the standing of journals and conferences. How, then, can a computer automatically route each paper to a reviewer who is expert in the paper's subject? Automatic text categorization offers a suitable solution.

To address these questions, this thesis proposes an automatic reviewer recommendation model based on automatic text categorization. Through categorization, the computer classifies both the submitted papers and the reviewers' published papers by subject, which identifies the subject of each submission and the research direction of each reviewer. The computer can then assign each submission to a reviewer whose subject matches it, yielding an automatic reviewer recommendation model. The main contents and contributions of this thesis fall into three parts:

First, for feature selection, the thesis introduces the concept of Max Frequency and the correlation coefficient D(mik) between a term and a category, and then proposes the Improved χ² algorithm. Experimental results show that the Improved χ² algorithm performs well in feature selection.

Second, because the terms in a reviewer recommendation model are keywords, the thesis simplifies the form of the text vector used in the literature, proposes a vector space model algorithm, and improves the active-learning SVM classification algorithm. Experimental results show that this algorithm effectively removes irrelevant noise and improves the quality of the categorization model.

Third, the ATSVM categorization algorithm slows down as the number of categories grows in multi-class problems. To solve this problem, the thesis introduces DAGSVM to improve the ATSVM categorization algorithm. Results show that the improved ATSVM algorithm adds an interaction step and enables the classifier to learn on its own. In addition, the improved algorithm classifies both more accurately and more quickly.
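For readers unfamiliar with χ²-based feature selection, the following is a minimal sketch of the standard χ² statistic for a term and a category. It is not the thesis's Improved χ² algorithm, whose definition (including Max Frequency and D(mik)) is not given in this abstract. The function names `chi_square` and `select_features`, and the choice to rank each term by its maximum score over categories, are assumptions made for illustration.

```python
from collections import Counter


def chi_square(n_tc, n_t, n_c, n):
    """Standard chi-square score for one (term, category) pair.

    n_tc: documents in the category that contain the term
    n_t:  documents in any category that contain the term
    n_c:  documents in the category
    n:    total number of documents
    """
    a = n_tc            # term present, in category
    b = n_t - n_tc      # term present, other categories
    c = n_c - n_tc      # term absent, in category
    d = n - n_t - c     # term absent, other categories
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_features(docs, labels, k):
    """Return the k terms with the highest chi-square score.

    Assumption: a term's score is its maximum over all categories.
    docs is a list of token lists; labels is the matching category list.
    """
    n = len(docs)
    cat_count = Counter(labels)
    term_count = Counter()
    pair_count = Counter()
    for words, label in zip(docs, labels):
        for w in set(words):  # count document frequency, not term frequency
            term_count[w] += 1
            pair_count[w, label] += 1
    scores = {
        w: max(chi_square(pair_count[w, c], term_count[w], cat_count[c], n)
               for c in cat_count)
        for w in term_count
    }
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]
```

Calling `select_features(docs, labels, k)` on tokenized keyword lists keeps the terms most strongly associated with some category; the retained terms would then serve as the dimensions of the text vectors passed to a classifier.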
Keywords/Search Tags: Automatic text categorization, Max frequency, Vector space model, Active learning with support vector machines