| With the development of artificial intelligence, using machine learning methods to make decisions on social issues has become a trend. However, algorithmic decision-making carries potential ethical and security risks, especially regarding algorithmic fairness. The problem of algorithmic fairness arises when a decision-making model infers the inherent sensitive attributes of individuals or groups and uses them as decision factors, so that the algorithm outputs results that disadvantage some sensitive attribute groups. As a method of data feature extraction and transformation, representation learning can learn unbiased representations of data to ensure fair outputs in downstream tasks. However, existing fair representation models suffer from opaque mechanisms and poor interpretability, and cannot directly show the correlation between representation vectors and sensitive attribute vectors. In addition, the distance optimization methods adopted to achieve algorithmic fairness by reducing the difference in representation distributions among sensitive attribute groups are complex and unintuitive. Existing methods also overlook differences in representation characteristics. In terms of fairness measurement, existing research on fair representation lacks a model that achieves accuracy parity while maintaining the original joint accuracy. To address these issues, this paper combines an analysis of the correlation between representation vectors with constraints on representation characteristics, and proposes the following three research contributions: (1) A fair representation learning method based on separating latent sensitive information is proposed to address the transparency problem of existing fair representation algorithms. Through an information-theoretic analysis of the representation data and the sensitive information data, the correlation between the independent representation vectors and the sensitive attribute vectors is constrained to separate the latent sensitive information in the 
representation. On this basis, each representation vector generated by the model intuitively either reduces or increases its correlation with a particular piece of sensitive information. Experiments show that the proposed model achieves better fairness-accuracy trade-offs than previous fair models. (2) A fair classification method based on balanced representation characteristics is proposed to address the problem of complex optimization of representation distributions. The proposed constraints make groups with the same label but different sensitive attributes generate similar representations. Weighted training and alternating training jointly optimize the representation model and the classification model, enabling the classification model to output fair results. (3) A fair representation learning algorithm towards accuracy parity is proposed to address the problem of accuracy disparity in the classification model. Based on balanced representation characteristics, the relationship between model accuracy and sensitive information is analyzed. The cross-entropy loss is used as the optimization objective of the classification model, and the correlation between the computed cross-entropy error vector and the sensitive information vector is constrained. Experiments show that the proposed model enables downstream classifiers to achieve accuracy parity. For each of the methods proposed above, we conducted corresponding experiments to verify its effectiveness, and compared it with related algorithms to demonstrate the advantages of our model. |
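
The constraint described in contribution (3) can be illustrated with a small sketch. The abstract does not state which correlation measure or weighting is used, so the following assumes a binary classifier, a binary sensitive attribute, the absolute Pearson correlation as the penalty, and a hypothetical trade-off weight `lam`; the function names are illustrative and are not taken from the paper.

```python
import math


def per_sample_cross_entropy(probs, labels, eps=1e-12):
    """Binary cross-entropy of each sample: the 'error vector'."""
    return [
        -(y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps)))
        for p, y in zip(probs, labels)
    ]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def fairness_penalized_loss(probs, labels, sensitive, lam=1.0):
    """Mean cross-entropy plus lam * |corr(error vector, sensitive vector)|.

    Assumption: Pearson correlation stands in for whatever dependence
    measure the paper actually constrains.
    """
    errors = per_sample_cross_entropy(probs, labels)
    task_loss = sum(errors) / len(errors)
    penalty = abs(pearson(errors, sensitive))
    return task_loss + lam * penalty, task_loss, penalty


# Group 1 (sensitive = 1) is predicted less confidently than group 0,
# so its errors are larger and the penalty is strictly positive.
total, task, penalty = fairness_penalized_loss(
    probs=[0.9, 0.8, 0.6, 0.55],
    labels=[1, 1, 1, 1],
    sensitive=[0, 0, 1, 1],
    lam=0.5,
)
print(f"task loss={task:.3f}  penalty={penalty:.3f}  total={total:.3f}")
```

When the per-sample errors are distributed identically across the two sensitive groups, the penalty vanishes and only the ordinary cross-entropy remains, which is the intuition behind driving the classifier towards accuracy parity. In an actual training setup the same quantity would be computed with a differentiable tensor library so that it can be minimized by gradient descent.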