Interactive recommender systems (RSs) allow users to express intent, pre...
The development of recommender systems that optimize multi-turn interact...
Efficient exploration in multi-armed bandits is a fundamental online lea...
We study a contextual bandit setting where the learning agent has access...
We learn bandit policies that maximize the average reward over bandit in...
We propose RecSim, a configurable platform for authoring simulation envi...
The prevalent approach to bandit algorithm design is to have a low-regre...