Study On The Extraction Of Representative Questions And Answers In Virtual Q & A Community

Posted on: 2019-03-26
Degree: Master
Type: Thesis
Country: China
Candidate: B Liu
GTID: 2416330599964048
Subject: Management Science and Engineering
Abstract/Summary:
In the era of rapidly developing Web 2.0 technology, question-and-answer community websites such as Yahoo! Answers, Baidu Know, and Zhihu publish tens of thousands of questions every day, and some questions attract hundreds of answers. This large volume of questions and answers leaves the community heavily overloaded with redundant information, which hinders knowledge sharing. This paper therefore proposes representative extraction methods for questions and for answers. The methods separately extract a representative subset of questions and a representative subset of answers, so that users can quickly obtain a complete and comprehensive view of the questions or answers from the extracted representatives.

For questions, the original text dataset is first modeled with the LDA algorithm. K-means clustering is then performed with the number of representative questions as the number of clusters, and one question is extracted from each cluster to form a candidate representative subset. An optimization model with coverage and redundancy as its objective function is then constructed to find the optimal representative subset of questions. For answers, the candidate representative subset is likewise obtained with LDA and k-means clustering, and an optimization model is then constructed with coverage, redundancy, and each answer's "number of likes" as the objective function to find the optimal representative subset.

To verify the validity of the proposed extraction methods for questions and for answers, the "Zhihu" community was used as the data source, and the questions and answers under "Smog" and other emergency-related topics were taken as experimental data. The methods were compared with four benchmark methods. The experimental results show that the representative subsets extracted by the proposed methods are superior to those of the four benchmark methods in terms of coverage and redundancy.
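
The selection step described in the abstract can be illustrated with a small sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: coverage is taken as the mean best cosine similarity between every item and the chosen subset, redundancy as the mean pairwise similarity inside the subset, the objective as coverage minus `lam` times redundancy plus `mu` times a normalized like count, and the search as a simple greedy procedure. The input vectors stand in for per-document topic distributions such as LDA would produce; the function names and the weights `lam` and `mu` are hypothetical and are not taken from the thesis.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def coverage(vectors, chosen):
    """Assumed definition: mean over all items of the best similarity
    to any chosen item."""
    total = sum(max(cosine(v, vectors[j]) for j in chosen) for v in vectors)
    return total / len(vectors)


def redundancy(vectors, chosen):
    """Assumed definition: mean pairwise similarity among chosen items."""
    pairs = [(i, j) for a, i in enumerate(chosen) for j in chosen[a + 1:]]
    if not pairs:
        return 0.0
    return sum(cosine(vectors[i], vectors[j]) for i, j in pairs) / len(pairs)


def score(vectors, chosen, likes=None, lam=1.0, mu=0.0):
    """Hypothetical objective: reward coverage, penalize redundancy,
    and optionally reward likes (used for answers only)."""
    s = coverage(vectors, chosen) - lam * redundancy(vectors, chosen)
    if likes is not None and mu:
        top = max(likes) or 1  # avoid division by zero when all likes are 0
        s += mu * sum(likes[i] for i in chosen) / (top * len(chosen))
    return s


def select_representatives(vectors, k, likes=None, lam=1.0, mu=0.0):
    """Greedy search (an assumption, not the thesis's solver): repeatedly
    add the item that gives the highest objective value."""
    chosen = []
    while len(chosen) < min(k, len(vectors)):
        rest = [i for i in range(len(vectors)) if i not in chosen]
        best = max(rest, key=lambda i: score(vectors, chosen + [i], likes, lam, mu))
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    # Toy topic distributions: two items on topic A, two on B, one on C.
    topic_vectors = [
        [0.90, 0.05, 0.05],
        [0.80, 0.10, 0.10],
        [0.05, 0.90, 0.05],
        [0.10, 0.80, 0.10],
        [0.05, 0.05, 0.90],
    ]
    print(select_representatives(topic_vectors, k=3))
```

For questions, the sketch would be called without `likes`; for answers, passing a like count per answer and a positive `mu` lets popularity break ties between near-duplicate answers. In the pipeline the abstract describes, k-means first narrows the pool to one candidate per cluster, so a real implementation would apply the objective to that candidate pool rather than to every item as this toy version does.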
Keywords/Search Tags: Q&A community, Representative extraction of questions, Representative extraction of answers, LDA, Clustering