Answer Selection For Non-factoid Question

Posted on:2014-03-29

Degree:Master

Type:Thesis

Country:China

Candidate:Z H Tian

Full Text:PDF

GTID:2268330422950614

Subject:Computer Science and Technology

Abstract/Summary:


With the development of question answering communities, more and more user-generated content has accumulated. This content is not only large in volume and highly varied, but also of high quality and well worth reusing. To manage and utilize these resources, researchers have carried out many studies in recent years, and community question answering is one of the areas that has attracted the most attention.

Community question answering (cQA) is based on data from question answering sites, which makes it quite different from traditional question answering systems. Traditional question answering focuses on question understanding and answer extraction to solve factoid questions, whose answers are mainly phrases and named entities. cQA, by contrast, places no such constraints on question types, and it has a great advantage in handling questions that ask for advice and opinions. Studies on cQA cover many areas, such as question search and routing, question interestingness, question and answer quality, answer ranking, and user expertise. Moreover, question search and answer selection, as the key components of cQA, have drawn much attention from both academia and industry.

The main work of this thesis is building a cQA system based on a large amount of question answering data, and devising methods for question understanding, question search, and answer selection.

To build the cQA system, this thesis collected over 130 million questions and 1 billion answers from Yahoo! Answers and other question answering sites. This dataset is much larger than that of any previous cQA study, which demonstrates the efficiency and practicality of the proposed method. Based on these data, this thesis applies an automatic method for classifying question queries into different categories to improve efficiency and effectiveness. For question search, it proposes using a learning-to-rank algorithm to combine different levels of structural and semantic features extracted from the question query and the candidate questions, with the aim of solving the term mismatch problem between the two. Experiments show that the ranking model trained with Ranking SVM outperforms the baseline methods on different datasets in precision and other evaluation metrics.

After retrieving the questions relevant to a question query through question search, this thesis devises a new unsupervised method for detecting low-quality answers using content-based features. The method rests on three assumptions: (1) most answers under a question are normal, and only a few are of low quality; (2) low-quality answers can be detected by checking them against their peer answers under the same question; (3) different questions should have different criteria for answer quality. Based on these assumptions, the method minimizes the variance of the answers' feature vectors while keeping as many answers as possible. Experiments show that the method improves on the ROC results of the baseline methods.

After filtering out low-quality answers, this thesis applies a Ranking SVM algorithm to rank the remaining answers using content and user expertise features. In an evaluation on 300 high-frequency question queries taken from the query logs of a commercial search engine, the system achieved 78% accuracy in answering the question queries. With all of the above procedures, this thesis builds an efficient and effective cQA system that can give an answer within 2 seconds for any query.
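
The abstract does not say how the variance minimization for low-quality answer detection is actually carried out. As a rough illustration only, the following Python sketch assumes a greedy trimming heuristic: within a single question, it repeatedly drops the answer whose removal shrinks the variance of the remaining feature vectors the most, and stops once too few answers would remain or the gain becomes small. The function names and the parameters `min_keep_ratio` and `min_relative_gain` are hypothetical and do not come from the thesis.

```python
import math
from typing import List, Sequence


def total_variance(vectors: Sequence[Sequence[float]]) -> float:
    """Mean squared distance of the vectors from their centroid."""
    n = len(vectors)
    if n == 0:
        return 0.0
    dim = len(vectors[0])
    centroid = [sum(v[d] for v in vectors) / n for d in range(dim)]
    return sum(
        sum((v[d] - centroid[d]) ** 2 for d in range(dim)) for v in vectors
    ) / n


def flag_low_quality(
    features: Sequence[Sequence[float]],
    min_keep_ratio: float = 0.7,     # hypothetical: "most answers are normal"
    min_relative_gain: float = 0.5,  # hypothetical stopping threshold
) -> List[int]:
    """Return indices of answers flagged as low quality for ONE question.

    Answers are compared only with their peers under the same question,
    so the effective quality criterion differs from question to question.
    """
    kept = list(range(len(features)))
    flagged: List[int] = []
    min_keep = max(2, math.ceil(min_keep_ratio * len(features)))

    while len(kept) > min_keep:
        current = total_variance([features[i] for i in kept])
        if current == 0.0:
            break  # all remaining answers look identical

        # Find the single answer whose removal lowers the variance the most.
        best_idx, best_var = None, current
        for i in kept:
            var = total_variance([features[j] for j in kept if j != i])
            if var < best_var:
                best_idx, best_var = i, var

        # Stop when no removal helps enough to justify losing an answer.
        if best_idx is None or (current - best_var) / current < min_relative_gain:
            break
        kept.remove(best_idx)
        flagged.append(best_idx)

    return flagged


if __name__ == "__main__":
    # Four similar answers and one outlier (made-up feature values).
    answers = [[0.9, 0.8], [1.0, 0.9], [0.95, 0.85], [1.0, 0.8], [0.05, 0.1]]
    print(flag_low_quality(answers))
```

In this sketch `min_keep_ratio` stands in for assumption (1), and running the procedure separately for each question stands in for assumptions (2) and (3). The content-based features themselves are not listed in the abstract, so the input vectors are treated as given.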

Keywords/Search Tags:

community question answering, question search, answer quality, Ranking SVM


Related items

1	A Study Of Ranking Methods For Searching In Community Question Answering
2	Research And Application Of Key Technologies Of Community Question Answering
3	Research And Application Of Answer Ranking And Question Retrieval In Community Question Answering System
4	Mutual Promotion Of Question Retrieval And Answer Ranking In Community Question Answering
5	Research On Question Correlation And Answer Ranking Based In Question Answering Community
6	Research On Question-type Sensitive Answer Summarization In Community Question Answering
7	Research On Key Techniques In Community Question Answering Site
8	Comprehensive Information Based Community Question Answering System
9	Research On The Re-use Of Community Question Answering Knowledge
10	Research On Candidate Answers Ranking For Temporal Question