On Properties and Optimization of Information-theoretic Privacy Watchdog
We study the problem of privacy preservation in data sharing, where S is a sensitive variable to be protected and X is a non-sensitive useful variable correlated with S. Variable X is randomized into variable Y, which will be shared or released according to p_Y|X(y|x). We measure privacy leakage by information privacy (also known as log-lift in the literature), which guarantees mutual information privacy and differential privacy (DP). Let X_ε ⊆ X contain the elements in the alphabet of X for which the absolute value of the log-lift (abs-log-lift for short) is greater than a desired threshold ε. When elements x ∈ X_ε are randomized into y ∈ X_ε, we derive the best upper bound on the abs-log-lift across the resultant pairs (s, y). We then prove that this bound is achievable via an X-invariant randomization p(y|x) = R(y) for x, y ∈ X_ε. However, the utility measured by the mutual information I(X;Y) is severely damaged by imposing a strict upper bound on the abs-log-lift. To remedy this, and inspired by probabilistic (ε, δ)-DP, we propose a relaxed (ε, δ)-log-lift framework. To achieve this relaxation, we introduce a greedy algorithm that exempts some elements in X_ε from randomization, as long as their abs-log-lift is bounded by ε with probability at least 1-δ. Numerical results demonstrate the efficacy of this algorithm in achieving a better privacy-utility tradeoff.
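To make the quantities concrete, the following sketch works through the watchdog idea on a small hypothetical joint distribution p(s, x) (the distribution, alphabet sizes, and threshold are illustrative assumptions, not taken from the paper). It computes each symbol's abs-log-lift, collects the high-lift set X_ε, and verifies that merging X_ε via an X-invariant randomization p(y|x) = R(y) — under which every merged output carries the conditional p(s | X ∈ X_ε) — reduces the abs-log-lift of the merged symbol below ε.

```python
import math

# Hypothetical joint distribution p(s, x): S = {0, 1}, X = {0, 1, 2, 3}.
p_sx = {
    (0, 0): 0.30, (0, 1): 0.05, (0, 2): 0.10, (0, 3): 0.05,
    (1, 0): 0.05, (1, 1): 0.25, (1, 2): 0.10, (1, 3): 0.10,
}
S = {0, 1}
X = {0, 1, 2, 3}

# Marginals p(s) and p(x).
p_s = {s: sum(p_sx[(s, x)] for x in X) for s in S}
p_x = {x: sum(p_sx[(s, x)] for s in S) for x in X}

def abs_log_lift(x):
    """Worst-case |log( p(s|x) / p(s) )| over all s."""
    return max(abs(math.log(p_sx[(s, x)] / p_x[x] / p_s[s])) for s in S)

eps = 0.5
# High-lift set X_eps: symbols whose abs-log-lift exceeds the threshold.
X_eps = {x for x in X if abs_log_lift(x) > eps}

# X-invariant randomization: every x in X_eps is mapped to y in X_eps with
# the same distribution R(y), independent of x, so each merged output y
# carries p(s | y) = p(s | X in X_eps) regardless of how R is chosen.
p_Xeps = sum(p_x[x] for x in X_eps)
merged_lift = max(
    abs(math.log(sum(p_sx[(s, x)] for x in X_eps) / p_Xeps / p_s[s]))
    for s in S
)
print(sorted(X_eps), merged_lift)
```

Because R(y) does not depend on x, the merged outputs are statistically indistinguishable with respect to which x ∈ X_ε produced them, which is what caps the abs-log-lift; the cost is that I(X;Y) loses all information about the identity of symbols inside X_ε, motivating the (ε, δ) relaxation described above.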