Analysis And Improvement Of HITS Algorithm On Web Hyperlink-structure Mining

Posted on: 2010-05-22
Degree: Master
Type: Thesis
Country: China
Candidate: A H Zhang
GTID: 2178360278997002
Subject: Computer application technology
Abstract/Summary:
With the rapid spread and development of the Internet and Web technology, people now have access to abundant information. However, the Internet's huge volume of data, its complexity, its highly dynamic nature, and its wide variety of clients make Internet resources difficult to exploit. Locating valuable information on the Web has therefore become an important issue in Web data mining. Traditional information browsing methods are mature; against this background, we mine the huge link resources of the Web according to their attributes, and then study and build a Web information retrieval model to find the information we need. Current methods for locating relevant web pages are based on hyperlink ranking algorithms. However, such methods can cause the topic drift problem, in which the results returned by the algorithm have high link density but are often irrelevant to the search topic. We study the classical Web structure mining algorithm HITS and observe that it suffers from topic drift because it considers only the hyperlinks among web pages and ignores page content. We therefore propose an improved HITS algorithm, G-HITS, that combines hyperlink analysis with content analysis. The new algorithm improves HITS by analyzing page content and assigning different weights to hyperlinks. Experiments show that the new algorithm is effective.
Keywords/Search Tags:Web, structure mining, hyperlink, HITS, G-HITS
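
The idea of weighting hyperlinks by content can be illustrated with a minimal sketch. The abstract does not give the actual G-HITS weighting formula, so the code below is only an assumption-laden illustration, not the thesis method: it assumes each page has a precomputed content relevance score in [0, 1] for the query, and it assumes the weight of a link is simply the relevance score of the page it points to. The function name `weighted_hits`, the `relevance` dictionary, and the toy graph are all hypothetical.

```python
import math


def weighted_hits(links, relevance, iterations=50):
    """Content-weighted HITS sketch (illustrative only, not the thesis's G-HITS).

    links:     dict mapping a page to the list of pages it links to.
    relevance: dict mapping a page to an assumed content relevance score in [0, 1].
    Returns (authority, hub) dictionaries, each normalized to unit Euclidean length.
    """
    pages = set(links) | {q for targets in links.values() for q in targets}
    auth = {p: 1.0 for p in pages}
    hub = {p: 1.0 for p in pages}

    def normalize(scores):
        norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
        return {p: v / norm for p, v in scores.items()}

    for _ in range(iterations):
        # Authority update: sum of hub scores of in-linking pages,
        # each link weighted by the target's content relevance (assumption).
        new_auth = {p: 0.0 for p in pages}
        for p, targets in links.items():
            for q in targets:
                new_auth[q] += relevance.get(q, 0.0) * hub[p]
        auth = normalize(new_auth)

        # Hub update: sum of authority scores of linked pages, same weights.
        new_hub = {p: 0.0 for p in pages}
        for p, targets in links.items():
            for q in targets:
                new_hub[p] += relevance.get(q, 0.0) * auth[q]
        hub = normalize(new_hub)

    return auth, hub


# Toy graph: "dense" has more in-links but little on-topic content.
toy_links = {"a": ["dense", "topic"], "b": ["dense", "topic"], "c": ["dense"]}
toy_relevance = {"a": 0.5, "b": 0.5, "c": 0.5, "dense": 0.1, "topic": 0.9}
uniform_relevance = {p: 1.0 for p in toy_relevance}

weighted_auth, _ = weighted_hits(toy_links, toy_relevance)
plain_auth, _ = weighted_hits(toy_links, uniform_relevance)
```

With uniform weights the sketch reduces to plain HITS, where the densely linked page ranks highest on authority; with the assumed content weights, the on-topic page overtakes it, which is the kind of topic drift correction the abstract describes.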