Practical Risk Measures in Reinforcement Learning
Practical application of Reinforcement Learning (RL) often involves risk considerations. We study a generalized approximation scheme for risk measures, based on Monte-Carlo simulations, where the risk measures need not necessarily be coherent. We demonstrate that, even in simple problems, measures such as the variance of the reward-to-go do not capture the risk in a satisfactory manner. In addition, we show how a risk measure can be derived from model's realizations. We propose a neural architecture for estimating the risk and suggest the risk critic architecture that can be use to optimize a policy under general risk measures. We conclude our work with experiments that demonstrate the efficacy of our approach.
READ FULL TEXT