Strengthened Information-theoretic Bounds on the Generalization Error
The following problem is considered: given a joint distribution P_XY and an event E, bound P_XY(E) in terms of P_XP_Y(E) (where P_XP_Y denotes the product of the marginals of P_XY) and a measure of dependence between X and Y. Such bounds have direct applications in the analysis of the generalization error of learning algorithms, where E represents a large-error event and the measure of dependence controls the degree of overfitting. Herein, bounds are derived using several information-theoretic measures of dependence, in particular mutual information, lautum information, maximal leakage, and J_∞. The mutual information bound can outperform comparable bounds in the literature by an arbitrarily large factor.
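To make the flavor of these bounds concrete, the LaTeX sketch below records one standard mutual-information bound of this type, obtained by applying the data-processing inequality for relative entropy to the indicator of E. It is given purely as an illustrative instance under that standard argument, not as the strengthened bound of the paper.

\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}

% Illustrative mutual-information bound via the data-processing inequality
% (a standard argument; not necessarily the paper's strengthened bound).
Let $E$ be an event and write $p \triangleq P_{XY}(E)$, $q \triangleq P_X P_Y(E)$,
and let $d(p\|q) \triangleq p\log\tfrac{p}{q} + (1-p)\log\tfrac{1-p}{1-q}$ denote
the binary relative entropy. Applying the data-processing inequality for
relative entropy to the indicator $\mathbf{1}_E$ gives
\begin{equation}
  d\bigl(P_{XY}(E)\,\big\|\,P_X P_Y(E)\bigr)
  \;\le\; D\bigl(P_{XY}\,\big\|\,P_X P_Y\bigr) \;=\; I(X;Y).
\end{equation}
% Since d(p||q) >= p*log(1/q) - log 2, the inequality can be inverted:
Using $d(p\|q) \ge p\log\tfrac{1}{q} - \log 2$, this yields the explicit bound
\begin{equation}
  P_{XY}(E) \;\le\; \frac{I(X;Y) + \log 2}{\log\tfrac{1}{P_X P_Y(E)}},
  \qquad \text{whenever } P_X P_Y(E) < 1 .
\end{equation}

\end{document}

In the generalization-error setting, X plays the role of the training sample, Y the output hypothesis, and P_XP_Y(E) is the probability of the large-error event when the hypothesis is independent of the data, so the bound quantifies how much dependence (overfitting) can inflate that probability.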