The objective of transfer reinforcement learning is to generalize from a set of previous tasks to unseen new tasks. In this work, we focus on the transfer scenario where the dynamics among tasks are the same but their goals differ. Although general value functions (Sutton et al., 2011) have been shown to be useful for knowledge transfer, learning a universal value function can be challenging in practice. To address this, we propose (1) universal successor representations (USR) to represent the transferable knowledge and (2) a USR approximator (USRA) that can be trained by interacting with the environment. Our experiments show that USR can be effectively applied to new tasks, and that an agent initialized with the trained USRA reaches the goal considerably faster than one initialized randomly.
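
The USR/USRA idea above rests on the standard successor-feature factorization V(s, g) = psi(s, g) . w(g), with rewards r(s, g) = phi(s) . w(g): psi carries the shared dynamics knowledge while w holds the goal-specific reward weights. Below is a minimal sketch of such a universal successor representation approximator; the PyTorch MLP architectures, the fixed feature map phi, and the TD(0) target psi(s, g) ~ phi(s) + gamma * psi(s', g) are illustrative assumptions, not the paper's exact setup.

<code python>
# Minimal sketch of a universal successor representation approximator (USRA).
# Architecture and training target are illustrative assumptions, not the
# paper's exact setup.
import torch
import torch.nn as nn

class USRA(nn.Module):
    """psi(s, g): goal-conditioned successor features, plus reward weights w(g)."""
    def __init__(self, state_dim, goal_dim, feat_dim, hidden=128):
        super().__init__()
        # (state, goal) -> successor features psi(s, g)
        self.psi = nn.Sequential(
            nn.Linear(state_dim + goal_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, feat_dim))
        # goal -> reward weights w(g), so that r(s, g) = phi(s) . w(g)
        self.w = nn.Sequential(
            nn.Linear(goal_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, feat_dim))

    def forward(self, s, g):
        return self.psi(torch.cat([s, g], dim=-1))

    def value(self, s, g):
        # V(s, g) = psi(s, g) . w(g): transferable dynamics knowledge (psi)
        # combined with task-specific reward weights (w).
        return (self(s, g) * self.w(g)).sum(-1)

def td_loss(model, phi_s, s, s_next, g, gamma=0.99):
    # Successor features satisfy a Bellman equation in the state features,
    # psi(s, g) = phi(s) + gamma * psi(s', g); regress onto the bootstrapped target.
    with torch.no_grad():
        target = phi_s + gamma * model(s_next, g)
    return ((model(s, g) - target) ** 2).mean()
</code>

Transfer to a new goal then amounts to reusing psi and adapting only w(g), or fine-tuning from the trained USRA rather than from a random initialization, which is the speed-up the abstract reports.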

https://arxiv.org/abs/1806.10909 ResNet with one-neuron hidden layers is a Universal Approximator

https://arxiv.org/pdf/1809.08836.pdf Dense neural networks as sparse graphs and the lightning initialization