Impact of Data Processing on Fairness in Supervised Learning
We study the impact of pre and post processing for reducing discrimination in data-driven decision makers. We first analyze the fundamental trade-off between fairness and accuracy in a pre-processing approach, and propose a design for a pre-processing module based on a convex optimization program, which can be added before the original classifier. This leads to a fundamental lower bound on attainable discrimination, given any acceptable distortion in the outcome. Furthermore, we reformulate an existing post-processing method in terms of our accuracy and fairness measures, which allows comparing post-processing and pre-processing approaches. We show that under some mild conditions, pre-processing outperforms post-processing. Finally, we show that by appropriate choice of the discrimination measure, the optimization problem for both pre and post processing approaches will reduce to a linear program and hence can be solved efficiently.
READ FULL TEXT