
Research On Tag Recommendation Methods For Software Information Site

Posted on: 2020-10-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P Y Zhou
GTID: 1368330590454124
Subject: Computer software and theory
Abstract/Summary:
Today, software development engineers widely use various online information platforms to search for solutions, share development experience, publish open source projects, learn new software development skills, and answer other developers' questions. Throughout the software development life cycle, these platforms provide engineers with a variety of useful information and help improve the level of software development, so they are also known as software information sites. The content posted on these sites is referred to as software objects. As software information sites play an increasingly important role in software development, they have attracted extensive attention from academia and industry.

As software information sites evolve, the number of software objects they host grows rapidly, which makes it difficult for developers to locate a specific software object quickly. A classic practice, widely used in social media platforms and web communities, is to introduce tags as a lightweight management mechanism. Tags provide external metadata for web objects, and they have also been widely adopted by software information sites, where they are used to search, identify, classify, and organize software objects. Tags also bridge the social and technical gap and facilitate collaboration among developers. Software information sites therefore often require developers to tag the software objects they post. High-quality tags are concise and describe the most important features of a software object, so the quality of developers' tagging is very important for these sites.

However, tagging by developers is inherently a distributed and uncoordinated process: each developer is free to choose the tags deemed most appropriate for a software object, and most software information sites allow developers to tag software objects with their own words. Tagging is therefore easy and flexible, but it brings some problems. First, the number of distinct tags grows rapidly as software objects are continuously added, so a developer who wants to reuse existing tags must find the appropriate ones within a very large tag set. Furthermore, because developers may create and choose tags freely, more and more synonymous tags appear in the site. As a result, some software objects become increasingly poorly tagged. To better tag new content, effectively reuse existing tags, efficiently manage tag growth, and help developers quickly locate appropriate tags, building an automatic and efficient tag recommendation system for software information sites has become an important research problem in software engineering. This dissertation addresses that problem through the following four pieces of work.

(1) To make the tag recommendation system adapt to the dynamic changes of a software information site and respond quickly when the site and its tag set are large, we first propose TagMulRec, a tag recommendation method based on searching the text content of software objects and a learning strategy. TagMulRec first constructs indices over the text content of all software objects in the site. For a new software object, it then uses the indices to retrieve software objects whose text content is semantically similar to that of the new object, and these form the target candidate set. Based on the target candidate set, TagMulRec applies a tag recommendation algorithm based on semantic similarity to recommend K tags for the new software object.

(2) To make full use of the information in the software information site and to further improve the accuracy and service response time of tag recommendation, we propose FastTagRec, a tag recommendation method based on shallow learning over the text content of software objects. FastTagRec is built on a neural network model with a single hidden layer, which has a simple structure and can train the model and recommend tags quickly. FastTagRec uses a shared parameter matrix in the model input layer to exploit almost all of the text content and tag information of the software objects in the site. To further improve accuracy, it introduces text conditional constraints in the input layer to capture the dependencies between words in the text content of software objects.

(3) Inspired by the successful application of deep learning to other research problems in software engineering, and to further improve recommendation accuracy, we propose four tag recommendation approaches based on deep learning over the text content of software objects (TagCNN, TagRNN, TagHAN, and TagRCNN) and compare their efficiency and effectiveness against three non-deep-learning methods (EnTagRec, TagMulRec, and FastTagRec). Our comprehensive experiments demonstrate that an appropriate deep learning model structure can improve accuracy, but the non-deep-learning methods still have an advantage in service speed.

(4) The content of a software object mainly consists of code and text. To make full use of both, and so that the recommended tag list reflects the personal preferences of the developer who posts the software object, we propose Per-TagBHNN, a personalized tag recommendation method that combines the text and code content of software objects. First, Per-TagBHNN uses the TagBHNN method to build a tag recommendation model based on the text content and code content of software objects. Second, Per-TagBHNN adds a developer model that reflects the developer's tag usage preferences and topics of interest, and uses it to reorder the tag list recommended by TagBHNN. In the reordered list, tags that are highly relevant to the content of the software object rank above the others, and tags the developer prefers, or tags common to the developer's topics of interest, also rank relatively high.

We collected data from 10 well-known software information sites to build the experimental datasets. The effectiveness of the proposed approaches is verified by a series of experiments.
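
The retrieve-then-recommend idea described for TagMulRec in item (1) can be illustrated with a small sketch. This is not the dissertation's algorithm: the abstract does not specify the index structure, the similarity measure, or the scoring rule, so the TF-IDF cosine similarity, the inverted index, the similarity-weighted tag vote, and every name below (`TagIndex`, `recommend`, `n_neighbors`, the sample data) are assumptions made purely for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split into word-like tokens (keeps '+' and '#', e.g. c++, c#)."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


class TagIndex:
    """Hypothetical index over (text, tags) pairs; an illustrative assumption only."""

    def __init__(self, objects):
        self.tags = [set(tags) for _, tags in objects]
        term_counts = [Counter(tokenize(text)) for text, _ in objects]
        # Document frequency of each term, used for smoothed IDF weights.
        doc_freq = Counter()
        for counts in term_counts:
            doc_freq.update(counts.keys())
        n = len(objects)
        self.idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
        self.vectors = [self._normalize(c) for c in term_counts]
        # Inverted index: term -> ids of the objects that contain it.
        self.postings = defaultdict(set)
        for doc_id, counts in enumerate(term_counts):
            for term in counts:
                self.postings[term].add(doc_id)

    def _normalize(self, counts):
        """Build a unit-length TF-IDF vector, ignoring terms unseen at indexing time."""
        vec = {t: tf * self.idf[t] for t, tf in counts.items() if t in self.idf}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {t: w / norm for t, w in vec.items()} if norm else {}

    def recommend(self, text, k=3, n_neighbors=5):
        """Return up to k tags voted for by the most similar indexed objects."""
        query = self._normalize(Counter(tokenize(text)))
        # Candidate set: objects sharing at least one term with the query.
        candidates = set()
        for term in query:
            candidates |= self.postings.get(term, set())
        sims = []
        for doc_id in candidates:
            doc = self.vectors[doc_id]
            sims.append((sum(w * doc.get(t, 0.0) for t, w in query.items()), doc_id))
        sims.sort(reverse=True)
        # Each neighbour votes for its tags, weighted by its cosine similarity.
        scores = defaultdict(float)
        for sim, doc_id in sims[:n_neighbors]:
            for tag in self.tags[doc_id]:
                scores[tag] += sim
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [tag for tag, _ in ranked[:k]]


# Made-up sample data, for illustration only.
objects = [
    ("How to read a file line by line in Python", ["python", "file-io"]),
    ("Python list comprehension with condition", ["python", "list"]),
    ("Java NullPointerException when reading file", ["java", "exception"]),
    ("Segmentation fault with pointer arithmetic in C", ["c", "pointers"]),
]
index = TagIndex(objects)
print(index.recommend("read text file in python", k=2))
```

Because only the objects that share a term with the query are scored, the cost of a recommendation depends on the candidate set rather than on the whole site, which is one plausible reading of why an index-based design would suit large and frequently changing sites.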
Keywords/Search Tags: software information site, software object, tag recommendation, multi-label classification, data analysis