Practical Bounds of Kullback-Leibler Divergence Using Maximum Mean Discrepancy
Estimating the Kullback-Leibler (KL) divergence from data samples is a challenging task: existing approaches either impose restrictive assumptions based on domain knowledge of the underlying model or require explicit density approximation such as space partitioning. The kernel maximum mean discrepancy (MMD) offers an alternative non-parametric way to compare two populations, by finding the maximum difference in means attained by measurable functions in the unit ball of a reproducing kernel Hilbert space (RKHS). Building on the universal approximation property of universal kernels, we propose two corresponding classes of functions in an RKHS that bound the KL divergence from below and above, and derive the RKHS representations of these bounds. This allows us to develop asymptotically consistent estimators for the derived bounds. We evaluate the proposed bounds as mutual information proxies on an image dataset, and demonstrate that they stably track variations in mutual information.
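For context, the kernel MMD the abstract refers to admits a simple empirical estimator from two samples. The sketch below is not the paper's proposed bound estimator; it is a minimal illustration of the standard biased (V-statistic) squared-MMD estimate with a Gaussian kernel, where the kernel choice and bandwidth are assumptions for the example.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """RBF kernel matrix k(x_i, y_j) = exp(-||x_i - y_j||^2 / (2 * bandwidth^2))."""
    sq_dists = (np.sum(x**2, axis=1)[:, None]
                + np.sum(y**2, axis=1)[None, :]
                - 2.0 * x @ y.T)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def mmd_squared_biased(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples x ~ P and y ~ Q."""
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()

# Example: two Gaussian samples with shifted means give a positive MMD estimate.
rng = np.random.default_rng(0)
p_samples = rng.normal(0.0, 1.0, size=(500, 2))
q_samples = rng.normal(0.5, 1.0, size=(500, 2))
print(mmd_squared_biased(p_samples, q_samples))
```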