| With the rapid growth of information,the search of dense subgraphs can help people get useful information from large-scale graphs.In many practical applications,bipartite graphs are often used to model the relationship between two types of entities,and have a wide range of applications in different fields,such as e-commerce,social analysis,network services and bioinformatics.In real life,bipartite graphs often involve nodes with multidimensional attributes,which are usually represented by numerical values(movie score,user activity,etc.).However,most of the existing bipartite search algorithms either completely ignore the attribute value of the node or only consider one attribute value of the node,thus losing some useful information.Skyline computation,aiming at identifying a set of skyline points that are not dominated by any other point,is particularly useful for multi-criteria data analysis and decision making.Therefore,a set of nodes with both high connectivity(biclique)and optimal attribute values need to be retrieved according to some sorting functions.When no sorting function is specified,skyline will return the candidate object of the optimal object.In this paper,a new model,skyline(a,b)-biclique is proposed on the bipartite multi-valued attribute graph,in order to capture richer semantics in the bipartite multi-valued attribute graph,and obtain the optimal dense subgraph about the node attribute values.In this model,the structure cohesion of bipartite graph and the multi-dimensional attribute values of nodes are considered,and the number constraints of the two types of nodes in biclique are no less than a and b,respectively,and the(a,b)-biclique with the best node attribute values is selected by the dominant test of the size of the multidimensional attribute values of nodes by skyline properties.This paper proposes a baseline algorithm based on skyline(a,b)-biclique.The baseline algorithm is a dominant test algorithm based on biclique enumeration.Given parameters a and b,all the(a,b)-bicliques that meet the requirements are first searched out in the bipartite graph of multi-valued attributes,and the multi-dimensional node attribute values of different(a,b)-bicliques are then tested.Find the(a,b)-biclique with the best attribute value by comparison.Although the baseline algorithm provides a useful computing framework,its efficiency in calculating skyline(a,b)-bicliques is not high.In order to alleviate this problem,the algorithm is optimized for the main inspection part,and three optimization strategies are added to improve the efficiency,namely,pruning rule,verification rule and reduction rule.In order to support efficient query on bipartite graphs,an online query calculation algorithm based on index is proposed.The online algorithm queries all skylines(a,b)-bicliques.According to the results of skylines(a,b)-bicliques,a complete index I is designed,including nodes of each skyline(a,b)-bicliques.Although the query algorithm based on index I has high query efficiency,the spatial complexity of index I is very high.In order to strike a balance between the query efficiency and spatial complexity,a compact index I* is designed to effectively use space,and the computing sharing strategy is added to improve the efficiency of index construction process of index I*.On the five real data sets,a large number of experimental results show that the computational efficiency of the baseline algorithm based on the optimization strategy has been improved by more than 70% compared with that of the baseline algorithm,and the query efficiency of the online algorithm based on the complete index is more than 100 times faster than that of the baseline algorithm.The compact index-based algorithm that uses the balance of time and space can be more than 10 times faster than the baseline algorithm.In the case study of real data set,it is proved that the skyline(a,b)-biclique model proposed in this paper can more accurately search the dense subgraph containing the best candidate nodes than the biclique model.The experimental results prove the effectiveness and efficiency of the algorithm. |