| Today's world is an information society. Information is the most important asset of governments and companies, and the Internet, a huge network holding a vast amount of information, has naturally become the main source of information for government agencies and enterprises. However, most organizations still use Internet information in a relatively simple way, most commonly by finding it through search engines. This approach has several limitations: first, too much irrelevant information interferes with normal searching; second, old and new information are mixed together, which makes the latest information hard to find; and third, the retrieved information is unsorted and chaotic. To meet the requirements of government agencies and enterprises for professional, field-specific network intelligence gathering, we design and implement a field-oriented network intelligence collection system. The system uses a network information collection engine (Network Information Collect Engine) to copy the contents of multiple websites to a local database at pre-specified times. It then applies full-text search (Full-Text Search) technology and document similarity (Document Similarity) identification to the information database and merges documents whose contents are essentially the same, which makes the information easier for users to retrieve and use. System features include: (1) flexible customization of collection tasks; (2) management of multiple target data sources; (3) a separate collection configuration for each target data source, to ensure that the data is collected; (4) scheduling management of acquisition tasks, synchronized with the target site and supporting incremental acquisition; (5) complete process management of the collected results, converting heterogeneous data into homogeneous data; (6) release management of the collected results, in which a Publisher publishes data to the application platform. |
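
The document-merging step described above can be illustrated with a minimal sketch. The abstract does not say which similarity measure the system uses, so this example assumes cosine similarity over word-count vectors. The function names (`term_vector`, `cosine_similarity`, `merge_similar`), the 0.9 threshold, and the sample documents are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercase the text and count how often each word occurs."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def merge_similar(documents, threshold=0.9):
    """Greedily group documents by similarity to each group's first member.

    Returns a list of groups, each a list of indices into `documents`.
    The threshold is an assumed value, not one taken from the source.
    """
    groups = []  # each entry: (vector of first member, list of indices)
    for i, text in enumerate(documents):
        vec = term_vector(text)
        for rep, members in groups:
            if cosine_similarity(rep, vec) >= threshold:
                members.append(i)
                break
        else:
            # No existing group is close enough, so start a new one.
            groups.append((vec, [i]))
    return [members for _, members in groups]


# Hypothetical sample data: the first two texts differ only in punctuation.
docs = [
    "The city council approved the new budget on Monday.",
    "The city council approved the new budget on Monday!",
    "Heavy rain is expected across the region this weekend.",
]
groups = merge_similar(docs)
print(groups)
```

Comparing each document only against the first member of every group keeps the sketch short. A production system would more likely query the full-text index for candidate matches instead of scanning all groups.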