Adversarial Stacked Auto-Encoders for Fair Representation Learning
Training machine learning models with the only accuracy as a final goal may promote prejudices and discriminatory behaviors embedded in the data. One solution is to learn latent representations that fulfill specific fairness metrics. Different types of learning methods are employed to map data into the fair representational space. The main purpose is to learn a latent representation of data that scores well on a fairness metric while maintaining the usability for the downstream task. In this paper, we propose a new fair representation learning approach that leverages different levels of representation of data to tighten the fairness bounds of the learned representation. Our results show that stacking different auto-encoders and enforcing fairness at different latent spaces result in an improvement of fairness compared to other existing approaches.
READ FULL TEXT