Research Of A Topic-Based Web Collection System Of Tax Source Information

Posted on: 2016-07-26 | Degree: Master | Type: Thesis
Country: China | Candidate: M Ma | Full Text: PDF
GTID: 2309330470466422 | Subject: Public Finance
Abstract/Summary:
Taxes are the main source of state revenue and an important tool of national macroeconomic control. With the rapid development of China's economy, however, the loss of tax revenue has become increasingly prominent and has impaired the normal functioning of the tax system. The root cause of this loss is information asymmetry between the two sides: tax authorities and taxpayers. How to alleviate information asymmetry in the tax collection process has always been a focus of tax research.

With the rapid development of information technology, the Internet has penetrated daily work and life, and the volume of information on the Internet, which includes a variety of tax source information, has grown explosively. If the tax department can collect and make full use of the tax source information on the Internet, it will alleviate to some extent the information asymmetry between tax authorities and taxpayers. At the same time, the development of topic-based web information crawling technology makes it possible to collect tax source information more quickly and accurately. How to use topic-based web information crawling technology to collect tax source information is the main subject of this thesis.

Firstly, the thesis introduces the main points of asymmetric information theory and analyzes the manifestations and adverse effects of asymmetric information in tax collection. After summarizing the current solutions to information asymmetry in the tax collection process, the thesis proposes a new solution: establishing a topic-based Web collection system for tax source information. This part is the theoretical basis of the thesis and explains why the system should be established.

Secondly, building on a summary of the state of development of Web information collection systems and topic crawler algorithms, the thesis studies in depth the key technologies involved in a topic-based Web information crawling system: topic crawler technology, topic description technology, Web information extraction technology, and topic-correlation judgment technology. This part is the technical foundation of the thesis and establishes the technical feasibility of building the system.

Finally, the thesis analyzes and designs a topic-based Web collection system for tax source information. The system is designed to quickly and accurately collect topic-relevant tax source information from the Internet. The collected information is downloaded and stored in a local database according to a defined table structure, so that users in tax departments can query it. In the system design, the thesis uses multithreading to crawl Web information, builds a vector space model to judge the topic correlation of Web content, and applies a PageRank algorithm based on the analysis of hyperlink text to judge the topic correlation of URL hyperlinks.
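
The vector space model mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names (`term_vector`, `cosine_similarity`, `is_on_topic`), the raw term-frequency weighting, the English-only tokenizer, and the 0.2 threshold are all assumptions chosen for illustration. The idea is to represent a page and a topic description as term vectors and treat the cosine of the angle between them as the topic-correlation score.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Build a raw term-frequency vector (assumed weighting, not from the thesis)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as unrelated
    return dot / (norm_a * norm_b)


def is_on_topic(page_text, topic_text, threshold=0.2):
    """Hypothetical relevance test: keep pages whose score reaches the threshold."""
    score = cosine_similarity(term_vector(page_text), term_vector(topic_text))
    return score >= threshold
```

In a crawler of the kind described, a check like `is_on_topic(page_text, topic_description)` would decide whether a downloaded page is stored. A practical system would more likely use TF-IDF weights and a tokenizer suited to Chinese text.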
Keywords/Search Tags: Information asymmetry, tax source information, web information crawling, topic-correlation