
Semantic Web Service Discovery Based On LDA Clustering

Posted on: 2017-04-02
Degree: Master
Type: Thesis
Country: China
Candidate: L P Cao
GTID: 2308330485961771
Subject: Computer Science and Technology
Abstract/Summary:
With the development of the Internet and distributed systems, Service-Oriented Architecture (SOA) has become widely used in both academia and industry. As a distributed computing model built on Internet standards and XML technology, and a key technique for realizing SOA, the Web service has become a prominent research topic. With the exponential growth in the number of Web services, it is valuable to efficiently locate desired Web services and to select the best one from a group of functionally similar Web services; these are exactly the tasks of Web service discovery.

Common methods for Web service discovery fall into three groups: logic-based, non-logic-based and hybrid. Logic-based methods rely on a reasoner and on the completeness of the inference rules; they achieve higher accuracy but are less flexible and less feasible. For non-logic-based methods, it is difficult to choose a good similarity function and to guarantee its quality. Hybrid methods combine several approaches so that the strengths of one offset the weaknesses of another, and many experiments show that hybrid methods have clear advantages.

Building on previous work, we propose a semantic Web service discovery method based on LDA clustering, which is a hybrid method. First, the OWL-S Web service documents are parsed to obtain document word vectors. Then the document word vectors are extended to enrich their semantic information. Next, a topic model is built over the document word vectors and trained, and inference yields the Document-Topic distribution, from which the Web service documents are clustered. Finally, we search the Web service request records or the Web service clusters to find the Web services that meet the requirements. The main contributions of our work are:

(1) Document parser. We propose a novel document parsing method.
First, the OWL-S document is parsed to obtain the service name, service description, inputs and outputs; the service name and service description are then processed with stop-word removal and stemming to obtain the document word vectors. To enrich the semantics of the document, the OWL document corresponding to the OWL-S document is parsed to obtain the equivalent classes, parent classes, ancestor classes, subclasses and descendant classes of the input and output concepts, which are added to the document word vectors. Moreover, we use WordNet and Word2Vec to extend the initial document word vectors with highly similar words. Finally, we merge all the document word vectors to get the extended document word vectors, which carry rich semantic information.

(2) Document cluster. We implement a clustering method based on a probabilistic topic model. An LDA topic model is built for the extended document word vectors, and the Gibbs sampling algorithm is used for training and inference to obtain the Document-Topic distribution. The LKMSIMPClustering algorithm that we propose is then used to cluster all the documents into a collection of Web service clusters.

(3) Request query. We implement a lightweight Web service search. A Web service request first searches the memo DB module to determine whether it contains a record of that request. If so, the memo DB module returns the results directly. Otherwise, we find the most relevant Web service cluster and return the Web services in that cluster that meet the similarity threshold as the Web service discovery result.

(4) System implementation and experimental evaluation. We complete the semantic Web service discovery system based on LDA clustering and carry out experiments with the OWLS-TC4 and hRESTS-TC3_release2 data sets (containing 1083 services and 42 queries). We then compare it with existing work in terms of Precision, Recall, F-measure and efficiency.
The experimental results show that the Precision, Recall and F-measure of our system are 13.52%, 37.37% and 30.47% higher, respectively, than those of the traditional VSM method based on TF-IDF. In addition, significance tests on Precision, Recall and F-measure indicate that our system is also effective for other Web service requests. We also present a concrete example to illustrate the whole Web service discovery process.
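
The request query step in contribution (3) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the dictionary layout of a cluster, the use of cosine similarity over Document-Topic vectors, the default threshold, and the choice to record a result in the memo after a miss are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two topic-distribution vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def discover(request_key, request_topics, memo, clusters, threshold=0.8):
    """Hypothetical lookup: memo first, then the most relevant cluster.

    memo     -- dict mapping a request key to a previously returned result
    clusters -- list of dicts with a "centroid" vector and a "services"
                dict mapping service name to its topic vector (assumed layout)
    """
    # Step 1: return the recorded result if this request was seen before.
    if request_key in memo:
        return memo[request_key]

    # Step 2: pick the cluster whose centroid is closest to the request.
    best = max(clusters, key=lambda c: cosine(request_topics, c["centroid"]))

    # Step 3: keep only services in that cluster that meet the threshold.
    hits = [name for name, topics in best["services"].items()
            if cosine(request_topics, topics) >= threshold]

    # Assumption: the result is recorded so a repeated request skips steps 2-3.
    memo[request_key] = hits
    return hits
```

Searching a single cluster instead of all services is what would make such a lookup lightweight: the per-service similarity is computed only for the members of the closest cluster, and a repeated request costs one dictionary lookup.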
Keywords/Search Tags:Web service discovery, Latent Dirichlet Allocation, Clustering, Semantic Web service