In recent years, Artificial Intelligence (AI) systems have surpassed hum...
Resource scheduling and allocation is a critical component of many high ...
Realistic environments often provide agents with very limited feedback. ...
Reward-free exploration is a reinforcement learning setting recently stu...
We propose MDP-GapE, a new trajectory-based Monte-Carlo Tree Search algo...
We develop a framework for the adaptive model predictive control of a li...
We study the design of learning architectures for behavioural planning i...
We consider the problem of online planning in a Markov Decision Process ...
Can we learn a control policy able to adapt its behaviour in real time s...
This work studies the design of safe control policies for large-scale no...