Font Size: a A A

Dynamic Semi-supervised Classification Algorithms On Network Data

Posted on:2014-12-05Degree:DoctorType:Dissertation
Country:ChinaCandidate:Z Y ShiFull Text:PDF
GTID:1220330452453598Subject:Mathematics
Abstract/Summary:PDF Full Text Request
Semi-supervised classification algorithms of network data are crucial in large-scaledata analysis. It is significant in many fields of applications, such as social network, cita-tion network and web data network. Intensive research on semi-supervised classificationof undirected network has been done in recent years. However, studies on the directednetwork are few because of the essential difculties in their modeling.Labeled data could only be a small proportion in the massive datasets. Hence, semi-supervised algorithms are inevitable choice. Based on molecular dynamic method, weproposed dynamic semi-supervised algorithms for directed network. Magazine citationweb is used as the dataset to describe the algorithms in this thesis. Magazines can beregarded as a weight directed network (or directed graph) by citations. Part of the mag-azines with categories can be marked with labels and can be taken as seed nodes. Theforce between nodes can be naturely defined by the citations. The nodes move under theforce and go close to the seed nodes with the similar feature.We hope that non-seed nodescan reach steady-state or dynamic steady-state near the seed nodes as the final state of theprocess.The selection of seed nodes is the key in semi-supervised algorithms. We proposea hierarchical semi-supervised dynamic classification algorithm by iteratively seed re-newing to improve the classification result. This leads to a non-convex quadratic con-strained quadratic programming problem. We give a global optimality condition of KKTpoints and propose a local search scheme with preprocessing to solve some sub-classesof quadratic programming problems that satisfy some conditions. Numerical examplesshow the efectiveness of this scheme.In order to describe classification results efciently, a visualization method is de-veloped to demonstrate the classification results. Since the classification results of manymagazines are similar, it is not easy to identify each magazines. The magazines are rank-related because of the directed edges. Hence, we combined the classification results withpersonalized PageRank of directed network and show the classification results.The main innovative contributions of this thesis are summarized as follows.A dynamic semi-supervised algorithm is proposed based on molecular dynamicmethod. Using the ranking index of the magazines, the classification results are visualized.A novel global optimality condition of quadratic programming is proposed. Weconstruct the equivalence relation between the Lagrangian multipliers of quadraticprogramming problems and conic programming problems over cones of nonnega-tive quadratic functions and propose Local Search Scheme with Preprocessing tosolve some sub-classes of quadratic programming problems.A hierarchical dynamic semi-supervised classification algorithm is developed. Bysolving a variant forms of canonical correlation analysis, seed nodes are updated inthe dynamic semi-supervised algorithm on directed network data.
Keywords/Search Tags:Directed Network, Semi-supervised Classification, Dynamic Model, Quadratic Programming
PDF Full Text Request
Related items