With the development of information technology, more and more people publish and access information over the Internet, which has become an indispensable part of our lives. However, as the amount of available data grows, managing that information becomes more difficult and can lead to information overload. The purpose of topic detection and tracking (TDT) is to develop automatic methods for identifying topically related stories across multiple media and in different languages, making information easier to access and manage. In this thesis, we construct a TDT system using the k-nearest neighbor (KNN) algorithm, for two reasons: first, KNN is one of the best text categorization algorithms; second, it was used to construct a TDT system in [2] with good results. Traditional KNN assumes that the training data are evenly distributed, which does not hold in a TDT system. We propose an approach that adjusts the score of every category to overcome this obstacle. Moreover, KNN is time-consuming because it must traverse all of the old stories to find the nearest neighbors. We therefore use a feature projection algorithm to speed up KNN while still obtaining competitive results.