We focus on the problem of uncertainty-informed allocation of medical
re...
In most applications of model-based Markov decision processes, the param...
Policy gradient algorithms have been widely applied to reinforcement lea...
In this paper, we consider the problem of best-arm identification in
mult...
Due to communication constraints and intermittent client availability in...
Error-Correcting Output Codes (ECOCs) offer a principled approach for
co...
We consider a multi-armed bandit framework where the rewards obtained by...
We consider a correlated multi-armed bandit problem in which rewards of ...
We consider a novel multi-armed bandit framework where the rewards obtai...
This paper studies the problem of learning the probability distribution...