Correlated Data in Differential Privacy: Definition and Analysis
Differential privacy is a rigorous mathematical framework for evaluating and protecting data privacy. In most existing studies, there is a vulnerable assumption that records in a dataset are independent when differential privacy is applied. However, in real-world datasets, records are likely to be correlated, which may lead to unexpected data leakage. In this survey, we investigate the issue of privacy loss due to data correlation under differential privacy models. Roughly, we classify existing literature into three lines: 1) using parameters to describe data correlation in differential privacy, 2) using models to describe data correlation in differential privacy, and 3) describing data correlation based on the framework of Pufferfish. Firstly, a detailed example is given to illustrate the issue of privacy leakage on correlated data in real scenes. Then our main work is to analyze and compare these methods, and evaluate situations that these diverse studies are applied. Finally, we propose some future challenges on correlated differential privacy.
READ FULL TEXT