On the Intrinsic Privacy of Stochastic Gradient Descent
Private learning algorithms have been proposed that ensure strong differential privacy (DP) guarantees; however, they often come at a cost to utility. Meanwhile, stochastic gradient descent (SGD) contains intrinsic randomness that has not been leveraged for privacy. In this work, we take the first step towards analysing the intrinsic privacy properties of SGD. Our primary contribution is a large-scale empirical analysis of SGD on convex and non-convex objectives. We evaluate the inherent variability due to the stochasticity of SGD on three datasets and calculate the ϵ values attributable to this intrinsic noise. First, we show that the variability in model parameters due to random sampling almost always exceeds that due to changes in the data. We observe that SGD provides intrinsic ϵ values of 7.8, 6.9, and 2.8 on the MNIST, Adult, and Forest Covertype datasets, respectively. Next, we propose a method to augment the intrinsic noise of SGD to achieve a desired ϵ. Our augmented SGD outputs models that outperform existing approaches with the same privacy guarantee, closing the gap to noiseless utility to as little as 0.19%. However, we find that the theoretical bound on the sensitivity of SGD is not tight. By estimating the tightest bound empirically, we achieve near-noiseless performance at ϵ = 1, closing the utility gap to the noiseless model to as little as 3.13%. Our experiments provide concrete evidence that changing the seed in SGD is likely to have a far greater impact on the model than excluding any given training example. By accounting for this intrinsic randomness, higher utility can be achieved without sacrificing further privacy. With these results, we hope to inspire the research community to further explore and characterise the randomness in SGD, its impact on privacy, and the parallels with generalisation in machine learning.
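To make the seed-versus-data comparison concrete, the sketch below shows one way such variability could be measured: train the same model under different random seeds and under leave-one-out variants of the training data, then compare the spread of the resulting parameter vectors. The `train_sgd` routine and the choice of mean pairwise L2 distance as the variability measure are illustrative assumptions, not the paper's exact experimental protocol.

```python
import itertools
import numpy as np

def pairwise_l2_spread(weight_list):
    """Mean pairwise L2 distance between final parameter vectors."""
    dists = [np.linalg.norm(a - b)
             for a, b in itertools.combinations(weight_list, 2)]
    return float(np.mean(dists))

def seed_vs_data_variability(train_sgd, X, y, n_runs=10):
    """Compare variability from reseeding SGD against variability from
    removing a single training example (leave-one-out).

    `train_sgd(X, y, seed)` is a hypothetical training routine that
    returns the final model parameters as a flat NumPy array.
    """
    # Variability due to the random seed alone, on the full dataset.
    seed_runs = [train_sgd(X, y, seed=s) for s in range(n_runs)]
    # Variability due to dropping one example at a time, with a fixed seed.
    data_runs = [train_sgd(np.delete(X, i, axis=0), np.delete(y, i), seed=0)
                 for i in range(n_runs)]
    return pairwise_l2_spread(seed_runs), pairwise_l2_spread(data_runs)
```

If the first returned value consistently exceeds the second, that mirrors the abstract's claim that reseeding moves the model more than excluding any single training example.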
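The following sketch illustrates the general idea of augmenting noise only up to a target privacy level: calibrate Gaussian noise to a desired (ϵ, δ) with the standard Gaussian-mechanism formula, and add only the variance not already supplied by SGD's intrinsic randomness. Treating the intrinsic noise as approximately Gaussian with a known scale, combining variances by subtraction, and the specific parameter values are assumptions made for illustration; they are not the paper's exact procedure.

```python
import numpy as np

def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale for the classical Gaussian mechanism:
    sigma = sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon
    (the standard calibration, derived for epsilon <= 1)."""
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon

def augment_model_noise(weights, sensitivity, epsilon, delta,
                        sigma_intrinsic, rng=None):
    """Add only the noise needed beyond SGD's intrinsic variability.

    Illustrative assumption: intrinsic randomness acts like independent
    Gaussian noise of scale `sigma_intrinsic`, so the extra variance is
    the difference of variances, clipped at zero.
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma_target = gaussian_sigma(sensitivity, epsilon, delta)
    extra_var = max(sigma_target ** 2 - sigma_intrinsic ** 2, 0.0)
    return weights + rng.normal(0.0, np.sqrt(extra_var), size=weights.shape)

# Hypothetical usage: weights from one SGD run, a sensitivity estimate
# (theoretical or empirical), and an intrinsic noise scale estimated by
# re-running SGD with different seeds on the same data.
w = np.zeros(100)  # stand-in for trained model weights
w_private = augment_model_noise(w, sensitivity=0.05, epsilon=1.0,
                                delta=1e-5, sigma_intrinsic=0.02)
```

A tighter (e.g. empirically estimated) sensitivity shrinks `sigma_target`, so less additional noise is required for the same ϵ, which is the mechanism behind the near-noiseless utility reported at ϵ = 1.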