Efficient Latent Representations using Multiple Tasks for Autonomous Driving
Driving in dynamic, multi-agent, complex urban environments is a difficult task that requires a complex decision policy. Learning such a policy requires a state representation that can encode the entire environment. Mid-level representations that encode a vehicle's environment as images have become a popular choice, but they are high-dimensional, which limits their use in data-scarce settings such as reinforcement learning. In this article, we propose to learn a low-dimensional yet rich feature representation of the environment by training an encoder-decoder deep neural network to predict multiple application-relevant factors, such as the trajectories of other agents. We demonstrate that a multi-head encoder-decoder neural network yields a more informative representation than a single-head encoder-decoder model. In particular, the proposed representation learning approach helps the policy network learn faster, reach higher performance, and use less data than existing approaches based on a single-head network.
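To make the multi-head idea concrete, the following is a minimal sketch and not the architecture used in this work: a shared encoder compresses a flattened input into a small latent vector, and each decoder head predicts one task-specific target from that latent. The class name `MultiHeadAutoencoder`, the head names `reconstruction` and `agent_trajectories`, all dimensions, the purely linear layers, and the equal task weights are assumptions chosen for illustration. Only the forward pass and the combined loss are shown; training is omitted.

```python
import random


def init_weights(rows, cols, rng):
    """Small random weight matrix with `rows` outputs and `cols` inputs."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(weights, x):
    """Dense layer without bias: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


class MultiHeadAutoencoder:
    """Shared encoder with one decoder head per task (hypothetical layout)."""

    def __init__(self, input_dim, latent_dim, head_dims, seed=0):
        rng = random.Random(seed)
        self.encoder = init_weights(latent_dim, input_dim, rng)
        # head_dims maps a task name to the size of that head's output.
        self.heads = {
            name: init_weights(dim, latent_dim, rng)
            for name, dim in head_dims.items()
        }

    def encode(self, x):
        """Low-dimensional latent that a downstream policy would consume."""
        return linear(self.encoder, x)

    def forward(self, x):
        """Return the latent and every head's prediction from that latent."""
        z = self.encode(x)
        return z, {name: linear(w, z) for name, w in self.heads.items()}


def multi_task_loss(outputs, targets, task_weights):
    """Weighted sum of per-head losses; the weighting scheme is assumed."""
    return sum(
        task_weights[name] * mse(outputs[name], targets[name])
        for name in outputs
    )


# Toy usage with made-up sizes: a 16-value input and a 4-value latent.
model = MultiHeadAutoencoder(
    input_dim=16,
    latent_dim=4,
    head_dims={"reconstruction": 16, "agent_trajectories": 6},
)
x = [0.5] * 16
z, outputs = model.forward(x)
targets = {"reconstruction": x, "agent_trajectories": [0.0] * 6}
loss = multi_task_loss(
    outputs, targets, {"reconstruction": 1.0, "agent_trajectories": 1.0}
)
```

Because every head decodes from the same latent `z`, the loss of each task would shape the shared encoder during training. A single-head baseline corresponds to passing only one entry in `head_dims`.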