We study the effect of baselines in on-policy stochastic policy gradient...
Bootstrapping is behind many of the successes of Deep Reinforcement Lear...
This work revisits the use of information criteria to characterize the g...
It has been postulated that a good representation is one that disentangl...
It has been postulated that a good representation is one that disentangl...
Finding features that disentangle the different causes of variation in r...