Subjective And Objective Combination Of Semantic Similarity Algorithm And Its Application

Posted on:2014-02-14

Degree:Master

Type:Thesis

Country:China

Candidate:X D Wu

GTID:2248330395984058

Subject:Computer application technology

Abstract/Summary:

With the spread of personal computers and the rapid development of the Internet, the number of Internet users and websites is growing quickly, and so is the amount of information on the Internet. Handling this volume of information is a challenge. Traditional information retrieval methods based on string matching can no longer meet current needs, and semantic-based information processing has emerged in response.

Semantic similarity calculation is a fundamental problem in natural language processing, intelligent retrieval, text clustering, and related fields. There are two main ways to calculate word similarity. The first relies on knowledge structures built by linguists, such as a semantic dictionary or semantic network, and is called the subjective method. The second relies on a large-scale corpus and is called the objective method. The knowledge-based method needs linguists to define the information for each word, and then calculates similarity from the characteristics of that information. The corpus-based method uses statistical methods to calculate similarity.

This thesis studies algorithms based on "HowNet" and on a large-scale corpus for calculating word similarity, and proposes an improved word semantic similarity algorithm that combines the objective and subjective approaches. During calculation, the algorithm eliminates interference factors so that the result conforms to both the subjective concept and the objective semantic environment.

Text is one of the most important carriers of information on the Internet, and text similarity calculation is the basis of text classification and text clustering. This thesis proposes a dual-level text similarity algorithm. The text is divided into two levels, title information and content information, and the text similarity consists of the two corresponding parts. The calculation uses the improved combined word semantic similarity algorithm, so that the result conforms to both the subjective concept and the objective semantic environment.

This thesis also builds an experimental platform. A comparison and analysis of the experimental results shows that the algorithm improves the results for the semantic similarity of both words and texts.
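The abstract does not give the formulas, so the following Python sketch only illustrates the general shape of the approach described above. It makes several assumptions that may differ from the thesis. The knowledge-based (subjective) score comes from a hypothetical `knowledge_sim` callable standing in for a HowNet-style measure. The corpus-based (objective) score is the cosine of co-occurrence count vectors. The two scores are blended linearly with an assumed weight `alpha`. Title and content similarities are combined with an assumed `title_weight`.

```python
from collections import Counter
from math import sqrt


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity of two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def combined_word_sim(w1, w2, knowledge_sim, contexts, alpha=0.5):
    """Blend a knowledge-based score with a corpus-based score.

    knowledge_sim: hypothetical callable returning a score in [0, 1]
                   (a stand-in for a HowNet-style measure).
    contexts:      word -> Counter of co-occurring words from a corpus.
    alpha:         assumed mixing weight; not taken from the thesis.
    """
    if w1 == w2:
        return 1.0
    subjective = knowledge_sim(w1, w2)
    objective = cosine(contexts.get(w1, Counter()), contexts.get(w2, Counter()))
    return alpha * subjective + (1 - alpha) * objective


def segment_sim(words_a, words_b, word_sim):
    """Symmetric average of each word's best match in the other segment."""
    if not words_a or not words_b:
        return 0.0
    a_to_b = sum(max(word_sim(a, b) for b in words_b) for a in words_a) / len(words_a)
    b_to_a = sum(max(word_sim(b, a) for a in words_a) for b in words_b) / len(words_b)
    return (a_to_b + b_to_a) / 2


def dual_level_sim(doc_a, doc_b, word_sim, title_weight=0.3):
    """Weighted sum of title-level and content-level similarity.

    Each document is a dict with pre-segmented "title" and "content" word lists.
    title_weight is an assumed value for illustration.
    """
    title = segment_sim(doc_a["title"], doc_b["title"], word_sim)
    body = segment_sim(doc_a["content"], doc_b["content"], word_sim)
    return title_weight * title + (1 - title_weight) * body
```

To use it, one would wrap `combined_word_sim` in a two-argument function (for example with a lambda that fixes `knowledge_sim` and `contexts`) and pass that as `word_sim` to `dual_level_sim`. Word segmentation is assumed to happen beforehand.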

Keywords/Search Tags:

Subjective semantic similarity, Objective semantic similarity, Word segmentation, Ontology, Hownet, Document similarity

Related items

1	The Research Of Semantic Similarity Computing Algorithm Based On HowNet
2	Sentence Similarity Computing Combining Multi-features Based On HowNet
3	The Research Of HowNet Based Word Similarity Computation And Its Application
4	The Research On Block-based Semantic Similarity Of Sentences
5	Research Of Comprehensive Weighted Word Semantic Similarity Computation
6	Research And Application Of Word Similarity Based On Context
7	An Algorithm For Optimizing Word Similarity In "Knowledge Network"
8	Research Of Hownet Based Word Semantic Computation And Application
9	Research And Implementation Of Subjective Question Scoring System Based On Chinese Word Segmentation And Text Similarity
10	Web Document Automatic Classification Based On Keywords