Learning and Covering Sums of Independent Random Variables with Unbounded Support
We study the problem of covering and learning sums X = X_1 + ⋯ + X_n of independent integer-valued random variables X_i (SIIRVs) with unbounded, or even infinite, support. De et al. (FOCS 2018) showed that the maximum value of the collective support of the X_i's necessarily appears in the sample complexity of learning X. In this work, we address two questions: (i) Are there general families of SIIRVs with unbounded support that can be learned with sample complexity independent of both n and the maximal element of the support? (ii) Are there general families of SIIRVs with unbounded support that admit proper sparse covers in total variation distance? For question (i), we provide a set of simple conditions that allow an unbounded SIIRV to be learned with sample complexity poly(1/ϵ), bypassing the aforementioned lower bound. We further address question (ii) in the general setting where each variable X_i has a unimodal probability mass function and is a different member of some, possibly multi-parameter, exponential family ℰ that satisfies certain structural properties. These properties allow ℰ to contain heavy-tailed and non-log-concave distributions. Moreover, we show that for every ϵ > 0 and every k-parameter family ℰ that satisfies these structural assumptions, there exists an algorithm with Õ(k) · poly(1/ϵ) samples that learns a sum of n arbitrary members of ℰ to within ϵ in total variation distance. The output of the learning algorithm is itself a sum of random variables whose distribution lies in the family ℰ. En route, we prove that any discrete unimodal exponential family with bounded constant-degree central moments can be approximated by the family corresponding to a bounded subset of the initial (unbounded) parameter space.
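To make the proper-learning setting concrete, the following minimal Python sketch (an illustration of the problem setup, not the paper's algorithm) uses Poisson summands: a unimodal one-parameter exponential family with unbounded support. Since a sum of independent Poissons is itself Poisson, a proper learner here only needs to estimate a single aggregate rate, with accuracy that depends on the number of samples but not on n or on the size of the support. The function names and the choice of family are assumptions made for illustration.

```python
# Hypothetical sketch of proper learning for a SIIRV with unbounded support.
# NOT the paper's algorithm: we pick X_i ~ Poisson(lambda_i), so the sum
# X = X_1 + ... + X_n is Poisson(sum_i lambda_i) and lies in the same family.

import numpy as np
from scipy.stats import poisson


def sample_siirv(lambdas, m, rng):
    """Draw m samples of X = X_1 + ... + X_n with X_i ~ Poisson(lambda_i)."""
    return sum(rng.poisson(lam, size=m) for lam in lambdas)


def proper_learn_poisson_sum(samples):
    """Output a member of the same family, Poisson(hat_lambda).

    The MLE of the aggregate rate is the sample mean; its error depends
    only on the number of samples, not on n or the maximal support value."""
    return float(np.mean(samples))


def tv_distance_poisson(lam_p, lam_q):
    """d_TV(P, Q) = (1/2) * sum_k |P(k) - Q(k)|, truncated far in the tail."""
    hi = int(max(lam_p, lam_q) + 20 * np.sqrt(max(lam_p, lam_q)) + 20)
    ks = np.arange(hi)
    return 0.5 * np.abs(poisson.pmf(ks, lam_p) - poisson.pmf(ks, lam_q)).sum()


rng = np.random.default_rng(0)
lambdas = rng.uniform(0.1, 2.0, size=50)        # n = 50 summands
samples = sample_siirv(lambdas, m=2000, rng=rng)
hat_lam = proper_learn_poisson_sum(samples)
print("true rate:", lambdas.sum(), "estimate:", hat_lam,
      "TV error:", tv_distance_poisson(lambdas.sum(), hat_lam))
```

The Poisson case is, of course, far simpler than the multi-parameter, possibly heavy-tailed families ℰ treated in the paper; it serves only to show what "proper" means here, namely that the learner's output is again a sum of members of ℰ.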