https://arxiv.org/pdf/1709.06053v1.pdf COUPLED ENSEMBLES OF NEURAL NETWORKS
  
https://openreview.net/pdf?id=SyZipzbCb DISTRIBUTIONAL POLICY GRADIENTS

https://arxiv.org/pdf/1711.03953.pdf BREAKING THE SOFTMAX BOTTLENECK: A HIGH-RANK RNN LANGUAGE MODEL

We formulate language modeling as a matrix factorization problem, and show that the expressiveness of Softmax-based models (including the majority of neural language models) is limited by a Softmax bottleneck. Given that natural language is highly context-dependent, this further implies that in practice Softmax with distributed word embeddings does not have enough capacity to model natural language.

Specifically, we introduce discrete latent variables into a recurrent language model, and formulate the next-token probability distribution as a Mixture of Softmaxes (MoS). Mixture of Softmaxes is more expressive than Softmax and other surrogates considered in prior work. Moreover, we show that MoS learns matrices that have much larger normalized singular values and thus much higher rank than Softmax and other baselines on real-world datasets.
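
A minimal NumPy sketch of the MoS idea summarized above: the next-token distribution is a convex combination of K ordinary softmaxes, each computed from its own projection of the RNN context vector, and the log of such a mixture is no longer a single low-rank log-linear function. The tanh projection and the parameter names (W_prior, W_proj, E) are illustrative assumptions, not the paper's exact parameterization.

<code python>
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mos_next_token_probs(h, W_prior, W_proj, E):
    """Mixture-of-Softmaxes output layer (sketch).

    h       : (d,)      context vector produced by the RNN
    W_prior : (K, d)    maps the context to K mixture weights
    W_proj  : (K, d, d) per-component projections of the context
    E       : (V, d)    output word embedding matrix shared by all components
    Returns a (V,) next-token distribution.
    """
    prior = softmax(W_prior @ h)                  # mixture weights pi_k
    probs = np.zeros(E.shape[0])
    for k in range(W_prior.shape[0]):
        h_k = np.tanh(W_proj[k] @ h)              # component-specific context
        probs += prior[k] * softmax(E @ h_k)      # convex combination of softmaxes
    return probs

# toy shapes: hidden size 8, vocabulary 20, K = 3 softmax components
rng = np.random.default_rng(0)
d, V, K = 8, 20, 3
p = mos_next_token_probs(rng.normal(size=d), rng.normal(size=(K, d)),
                         rng.normal(size=(K, d, d)), rng.normal(size=(V, d)))
print(p.sum())  # ~1.0, still a valid distribution
</code>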

https://arxiv.org/abs/1712.06560v1 Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents

The new algorithms, NS-ES and a version of QD we call NSR-ES, avoid local optima encountered by ES to achieve higher performance on tasks ranging from playing Atari to simulated robots learning to walk around a deceptive trap. This paper thus introduces a family of fast, scalable algorithms for reinforcement learning that are capable of directed exploration. It also adds this new family of exploration algorithms to the RL toolbox and raises the interesting possibility that analogous algorithms with multiple simultaneous paths of exploration might also combine well with existing RL algorithms outside ES.
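
A rough sketch of the novelty-seeking ES idea described above, under simplifying assumptions: behaviors are summarized by a small behavior characterization (BC), novelty is the mean distance to the k nearest BCs in an archive, and a plain ES estimator weights parameter perturbations by the novelty they achieve. The helper rollout_bc and all hyperparameters are hypothetical, not taken from the paper's code.

<code python>
import numpy as np

def novelty(bc, archive, k=10):
    """Mean distance from a behavior characterization to its k nearest
    neighbors in the archive of previously observed behaviors."""
    dists = np.linalg.norm(archive - bc, axis=1)
    return np.sort(dists)[:k].mean()

def ns_es_step(theta, rollout_bc, archive, sigma=0.02, alpha=0.01,
               n_pert=100, k=10, seed=0):
    """One novelty-search ES update: perturbations are scored by novelty
    rather than reward, pushing the policy toward behaviors it has not
    exhibited before (NSR-ES would average novelty with episode reward)."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=(n_pert, theta.size))
    scores = np.array([novelty(rollout_bc(theta + sigma * e), archive, k)
                       for e in eps])
    scores = (scores - scores.mean()) / (scores.std() + 1e-8)  # normalize
    grad = (scores[:, None] * eps).mean(axis=0) / sigma        # ES gradient estimate
    return theta + alpha * grad

# toy usage: the "behavior" is just the first two parameters of the policy
toy_bc = lambda th: th[:2]
archive = np.random.default_rng(1).normal(size=(50, 2))
theta = ns_es_step(np.zeros(4), toy_bc, archive)
</code>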

https://arxiv.org/abs/1802.06070 Diversity is All You Need: Learning Skills without a Reward Function

Our results suggest that unsupervised discovery of skills can serve as an effective pretraining mechanism for overcoming challenges of exploration and data efficiency in reinforcement learning.
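
As background for the claim above: DIAYN rewards a skill-conditioned policy for reaching states from which a learned discriminator can identify the active skill, roughly r = log q(z|s) - log p(z). A minimal sketch, assuming a hypothetical discriminator callable that returns per-skill log-probabilities:

<code python>
import numpy as np

def diayn_reward(state, skill_id, discriminator_log_probs, n_skills):
    """DIAYN-style intrinsic reward sketch: r = log q(z|s) - log p(z).

    discriminator_log_probs : hypothetical callable mapping a state to a
        (n_skills,) vector of log-probabilities from the classifier q(z|s)
    skill_id : index of the skill currently being executed
    Rewarding states from which the skill is identifiable drives the
    learned skills to be diverse, with no task reward needed.
    """
    log_q = discriminator_log_probs(state)[skill_id]  # log q(z|s)
    log_p = -np.log(n_skills)                         # uniform prior over skills
    return log_q - log_p
</code>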

https://arxiv.org/pdf/1804.08328v1.pdf Taskonomy: Disentangling Task Transfer Learning