Non-Asymptotic Sequential Tests for Overlapping Hypotheses and application to near optimal arm identification in bandit models

05/09/2019
by   Aurélien Garivier, et al.
0

In this paper, we study sequential testing problems with overlapping hypotheses. We first focus on the simple problem of assessing if the mean μ of a Gaussian distribution is ≥ -ϵ or ≤ϵ; if μ∈(-ϵ,ϵ), both answers are considered to be correct. Then, we consider PAC-best arm identification in a bandit model: given K probability distributions on R with means μ_1,...,μ_K, we derive the asymptotic complexity of identifying, with risk at most δ, an index I∈{1,...,K} such that μ_I≥_iμ_i -ϵ. We provide non asymptotic bounds on the error of a parallel General Likelihood Ratio Test, which can also be used for more general testing problems. We further propose lower bound on the number of observation needed to identify a correct hypothesis. Those lower bounds rely on information-theoretic arguments, and specifically on two versions of a change of measure lemma (a high-level form, and a low-level form) whose relative merits are discussed.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset