Estimation and Concentration of Missing Mass of Functions of Discrete Probability Distributions
Given a positive function g from [0,1] to the reals, the function's missing mass in a sequence of iid samples, defined as the sum of g(pr(x)) over the missing letters x, is introduced and studied. The missing mass of a function generalizes the classical missing mass, and has several interesting connections to other related estimation problems. Minimax estimation is studied for order-α missing mass (g(p) = p^α) for both integer and non-integer values of α. Exact minimax convergence rates are obtained for the integer case. Concentration is studied for a class of functions and specific results are derived for order-α missing mass and missing Shannon entropy (g(p) = -p log p). Sub-Gaussian tail bounds with near-optimal worst-case variance factors are derived. Two new notions of concentration, named strongly sub-Gamma and filtered sub-Gaussian concentration, are introduced and shown to result in right tail bounds that are better than those obtained from sub-Gaussian concentration.
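The definition above can be made concrete with a small numerical sketch. The snippet below is illustrative only and is not taken from the paper: the helper name `missing_mass_of_function`, the four-letter distribution `probs`, the sample size, and the seed are all assumptions. It computes the quantity itself, which requires the true probabilities; it is not one of the estimators studied in the paper.

```python
import math
import random


def missing_mass_of_function(g, probs, sample):
    """Sum of g(p_x) over the letters x that never appear in the sample.

    `probs` maps each letter to its true probability (hypothetical input).
    """
    seen = set(sample)
    return sum(g(p) for x, p in probs.items() if x not in seen)


# Hypothetical four-letter distribution, used only for illustration.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

# Draw a short iid sample; the seed is fixed only for repeatability.
rng = random.Random(0)
sample = rng.choices(list(probs), weights=list(probs.values()), k=5)

# Classical missing mass: g(p) = p.
classical = missing_mass_of_function(lambda p: p, probs, sample)
# Order-alpha missing mass with alpha = 2: g(p) = p^2.
order_2 = missing_mass_of_function(lambda p: p ** 2, probs, sample)
# Missing Shannon entropy: g(p) = -p log p (natural logarithm).
entropy = missing_mass_of_function(lambda p: -p * math.log(p), probs, sample)

print(sample, classical, order_2, entropy)
```

Taking g(p) = p recovers the classical missing mass, so the three calls differ only in the function passed in.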