Two Party Distribution Testing: Communication and Security
We study the problem of discrete distribution testing in the two-party setting. For example, in the standard closeness testing problem, Alice and Bob each have t samples from, respectively, distributions a and b over [n], and they need to test whether a=b or a,b are ϵ-far for some fixed ϵ>0. This is in contrast to the well-studied one-party case, where the tester has unrestricted access to samples of both distributions, for which optimal bounds are known for a number of variations. Despite being a natural constraint in applications, the two-party setting has evaded attention so far. We address two fundamental aspects: 1) what is the communication complexity, and 2) can it be accomplished securely, without Alice and Bob learning extra information about each other's input. Besides closeness testing, we also study the independence testing problem, where Alice and Bob have t samples from distributions a and b respectively, which may be correlated; the question is whether a,b are independent of ϵ-far from being independent. Our contribution is three-fold: 1) Communication: we show how to gain communication efficiency with more samples, beyond the information-theoretic bound on t. The gain is polynomially better than what one obtains by adapting standard algorithms. 2) Lower bounds: we prove tightness of our protocols for the closeness testing, and for the independence testing when the number of samples is unbounded. These lower bounds are of independent interest as these are the first 2-party communication lower bounds for testing problems. 3) Security: we define secure distribution testing and argue that it must leak at least some minimal information. We then provide secure versions of the above protocols with an overhead that is only polynomial in the security parameter.
READ FULL TEXT