Font Size: a A A

Research On Approximate Algorithm Of K-center Problem With Outliers And Fairness Constraints

Posted on:2023-07-20Degree:MasterType:Thesis
Country:ChinaCandidate:X C LingFull Text:PDF
GTID:2558307070483944Subject:Engineering
Abstract/Summary:PDF Full Text Request
In the Internet era of the 21 st century,the amount of data is growing explosively,and the research of big data analysis is imminent.Clustering problem is a branch of data analysis problem,which has attracted extensive attention.Among many formal models describing clustering problems,k-center problem is a common problem of clustering.This problem aims to divide the unlabeled data set into kclusters.The similarity of data in the same clusters is high and the similarity of data in different clusters is low.In the practical application of data analysis,data analysis often needs to be limited.For example,clustering problem with outliers,fair clustering problem,capacity clustering problem,etc.These clustering problems with constraints are called constrained clustering problems.This paper mainly discusses the k-center problem with outliers and the fair k-center problem with outliers.These two problems are NP hard problems and belong to constrained k-center problems.In this paper,solutions based on approximation algorithm are proposed for these two problems.(1)This paper studies the k-center problem with outliers,and designs a double standard approximation algorithm.The approximation algorithm introduces the idea of parameter algorithm and adds a fixed parameter kto reduce the time complexity and find the exact kcenter points.Firstly,some data points are randomly sampled and added to the candidate center point set.Then,the candidate center set is iterated,and some data points farthest from the candidate center point set are found in each round of iteration.Some points are randomly sampled from these data points and added to the candidate center point set.Finally,the candidate center point sets are enumerated,kcenter points are reserved as the center point set,and other data points are assigned to the center point closest to them.The algorithm can effectively deal with the k-center problem with outliers,and find the approximate solution of the k-center problem while removing the outliers.The approximate ratio is 3.Moreover,on reasonable assumptions,the algorithm can achieve single standard approximation.The time complexity of the algorithm is linear with respect to nwhen the value ofkis fixed.(2)This paper studies the fair k-center problem with outliers.In recent years,many researchers are committed to the fairness constraints of data analysis.Based on the limitations of fairness and outliers which are point data analysis limitations,this paper first proposes the fair k-center problem with outliers,and designs an approximate algorithm.Firstly,the algorithm uses greedy idea to find the initial solution of a k-center problem to remove outliers.Then,the center points are fixed,and the data points are redistributed near the center point by linear programming to meet the fairness restrictions.The fractional solution of a fair k-center problem with outliers is obtained.Finally,the linear programming rounding technique is used to rounding the fractional solution into an integer solution.In this paper,the first approximation algorithm with fair violation degree is proposed for the fair k-center problem with outliers.The approximation ratio is 5 and the violation degree is 4Δ+3+2logn.There are 10 figures,4 tables and 107 references in this thesis.
Keywords/Search Tags:Outliers, The k-center Problem, Approximation Algorithm, Fairness Constraints
PDF Full Text Request
Related items