Chinese sentence similarity computation is an essential and widely used task in Chinese information processing, and progress on it shapes the development of several related research directions. For example, in automatic question answering, example-based machine translation (EBMT), and information retrieval, how to compute sentence similarity is one of the most important problems; it is a long-studied hotspot and remains very difficult. In our research on Chinese sentence similarity computation, we focus on three levels: sememe, word, and sentence. This division reflects a feature of Chinese: words are composed of morphemes, and sentences are composed of words. Although the three levels differ, they are closely related and form, as a whole, a gradual progression from similarity computation to its applications. The main innovative achievements of this paper are as follows:

First, a method for extracting question intention is presented, based on our study of question intention. Question intention is the surface meaning that a question aims to express, and it corresponds to the feature of the sentence's object layer. After analyzing a large corpus, we divide questions into three types: question-word questions, A-not-A questions, and sentence-final particle questions. Because different question types call for different ways of extracting intention, an extraction method is put forward for each question type.

Secondly, we study a method for computing the semantic similarity between Chinese words. Using the abundant semantic information supplied by the HowNet semantic concept relation network, we compute a HowNet-based semantic similarity between Chinese words.

Thirdly, a Chinese sentence similarity computation method based on the fusion of multiple levels and multiple features is presented. This method makes full use of sentence information at the object level, the structure level, and the semantic level.
Several features are extracted, such as question intention, keyword set, sentence length, number of nouns, number of verbs, and number of proper nouns. A simple and effective fusion algorithm then combines them into an integrated feature, which serves as the sentence similarity value.

Fourthly, taking a natural language question answering system in the financial field as an example, we show the important role that Chinese sentence similarity computation plays in practice.

This research can contribute to several domains of Chinese information processing; it is valuable and, to a certain extent, has good prospects.
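
As an illustration of the multi-feature fusion summarized in the third contribution, the following Python sketch combines per-feature similarities into a single sentence similarity value. It is only a minimal sketch under stated assumptions: the abstract does not specify the fusion algorithm, so a weighted average is assumed here, and the names `SentenceFeatures`, `fused_similarity`, and `WEIGHTS`, the weight values, and the per-feature measures (exact match, Jaccard overlap, min/max ratio) are all hypothetical choices rather than the method of the paper. Word segmentation, part-of-speech tagging, and intention extraction are assumed to have been done beforehand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentenceFeatures:
    """Features named in the abstract, assumed to be precomputed."""
    intention: str        # extracted question intention (hypothetical label)
    keywords: frozenset   # keyword set
    length: int           # sentence length in words
    nouns: int            # number of nouns
    verbs: int            # number of verbs
    proper_nouns: int     # number of proper nouns


def ratio_sim(a: int, b: int) -> float:
    """Similarity of two non-negative counts: min/max, or 1.0 if both are 0."""
    if a == 0 and b == 0:
        return 1.0
    return min(a, b) / max(a, b)


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard overlap of two sets, or 1.0 if both are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Illustrative weights only (assumption); they are not taken from the paper.
WEIGHTS = {
    "intention": 0.3,
    "keywords": 0.3,
    "length": 0.1,
    "nouns": 0.1,
    "verbs": 0.1,
    "proper_nouns": 0.1,
}


def fused_similarity(x: SentenceFeatures, y: SentenceFeatures,
                     weights: dict = WEIGHTS) -> float:
    """Weighted average of per-feature similarities, in the range [0, 1]."""
    scores = {
        "intention": 1.0 if x.intention == y.intention else 0.0,
        "keywords": jaccard(x.keywords, y.keywords),
        "length": ratio_sim(x.length, y.length),
        "nouns": ratio_sim(x.nouns, y.nouns),
        "verbs": ratio_sim(x.verbs, y.verbs),
        "proper_nouns": ratio_sim(x.proper_nouns, y.proper_nouns),
    }
    total = sum(weights.values())
    return sum(weights[k] * scores[k] for k in weights) / total


if __name__ == "__main__":
    q1 = SentenceFeatures("利率", frozenset({"存款", "利率", "多少"}), 6, 2, 1, 0)
    q2 = SentenceFeatures("利率", frozenset({"存款", "利率", "现在"}), 5, 2, 1, 0)
    print(round(fused_similarity(q1, q2), 3))
```

In a question answering setting such as the financial example mentioned above, a score of this kind could be used to rank stored questions against a user's question; in practice the weights would need to be tuned on data rather than fixed by hand.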