This shows you the differences between two versions of the page.

invariant_representation [2018/08/18 12:57]
invariant_representation [2018/10/02 20:33]
Line 411: Line 411:
 We argue that invariances should instead be incorporated in the model structure, and learned using the marginal likelihood, which correctly rewards the reduced complexity of invariant models.
 +https://arxiv.org/abs/1706.01350 Emergence of Invariance and Disentanglement in Deep Representations
 +We propose regularizing the loss by bounding such a term in two equivalent ways: one with a Kullback-Leibler term, which relates to a PAC-Bayes perspective; the other using the information in the weights as a measure of complexity of a learned model, yielding a novel Information Bottleneck for the weights. Finally, we show that invariance and independence of the components of the representation learned by the network are bounded above and below by the information in the weights, and therefore are implicitly optimized during training. The theory enables us to quantify and predict sharp phase transitions between underfitting and overfitting of random labels when using our regularized loss, which we verify in experiments, and sheds light on the relation between the geometry of the loss function, invariance properties of the learned representation, and generalization error.
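The relation between the Kullback-Leibler term and the information in the weights can be sketched as follows. This is only an illustration under assumed notation, not the paper's exact statement: $\mathcal{D}$ is the training set, $q(w \mid \mathcal{D})$ the distribution over weights produced by training, $q(w)$ its marginal over datasets, $p(w)$ any fixed prior, $\ell(w; \mathcal{D})$ the training loss, and $\beta$ a hypothetical trade-off weight.

```latex
% Regularized loss with a KL penalty on the weights (assumed form):
\mathcal{L}(q) = \mathbb{E}_{w \sim q(w \mid \mathcal{D})}\big[\ell(w; \mathcal{D})\big]
  + \beta \, \mathrm{KL}\big(q(w \mid \mathcal{D}) \,\|\, p(w)\big)

% Standard identity linking the KL term to the information in the weights:
\mathbb{E}_{\mathcal{D}}\, \mathrm{KL}\big(q(w \mid \mathcal{D}) \,\|\, p(w)\big)
  = I(w; \mathcal{D}) + \mathrm{KL}\big(q(w) \,\|\, p(w)\big)
  \;\ge\; I(w; \mathcal{D})
```

The two terms coincide on average when the prior $p(w)$ equals the marginal $q(w)$, which is one way to read the two penalties as interchangeable.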
 +https://arxiv.org/pdf/1809.02601v1.pdf Accelerating Deep Neural Networks with Spatial Bottleneck Modules
 +This paper presents an efficient module named spatial bottleneck for accelerating the convolutional layers in deep neural networks. The core idea is to decompose convolution into two stages, which first reduce the spatial resolution of the feature map, and then restore it to the desired size. This operation decreases the sampling density in the spatial domain, which is independent yet complementary to network acceleration approaches in the channel domain. Using different sampling rates, we can trade off between recognition accuracy and model complexity.
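The cost saving from the two-stage decomposition can be estimated with a rough multiply-accumulate (MAC) count. The sketch below is illustrative only and rests on assumptions not stated in the abstract: the reduction stage is a strided convolution, the restoration stage is a transposed convolution, the baseline is two ordinary convolutions at full resolution, and padding and bias terms are ignored. All function names are hypothetical.

```python
def conv_macs(height, width, c_in, c_out, kernel, stride=1):
    """MACs of a convolution whose output is (height/stride) x (width/stride)."""
    out_h, out_w = height // stride, width // stride
    return out_h * out_w * c_in * c_out * kernel * kernel


def deconv_macs(in_height, in_width, c_in, c_out, kernel):
    """MACs of a transposed convolution, counted per input position."""
    return in_height * in_width * c_in * c_out * kernel * kernel


def spatial_bottleneck_macs(height, width, channels, kernel=3, stride=2):
    """Assumed two-stage module: strided conv (reduce), then transposed conv (restore)."""
    reduce_cost = conv_macs(height, width, channels, channels, kernel, stride)
    restore_cost = deconv_macs(height // stride, width // stride,
                               channels, channels, kernel)
    return reduce_cost + restore_cost


def baseline_macs(height, width, channels, kernel=3):
    """Assumed baseline: two ordinary convolutions at full resolution."""
    return 2 * conv_macs(height, width, channels, channels, kernel)


# Example: a 56x56 feature map with 64 channels and a sampling rate of 2.
baseline = baseline_macs(56, 56, 64)
bottleneck = spatial_bottleneck_macs(56, 56, 64)
ratio = bottleneck / baseline
print(f"baseline MACs:   {baseline}")
print(f"bottleneck MACs: {bottleneck}")
print(f"cost ratio:      {ratio}")
```

Under these assumptions the cost scales as 1/stride², so a larger sampling rate buys more speed at the price of a coarser intermediate feature map, which is the accuracy/complexity trade-off the abstract refers to.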
 +https://arxiv.org/abs/1809.02591v1 Learning Invariances for Policy Generalization
 +While recent progress has spawned very powerful machine learning systems, those agents remain extremely specialized and fail to transfer the knowledge they gain to similar yet unseen tasks. In this paper, we study a simple reinforcement learning problem and focus on learning policies that encode the proper invariances for generalization to different settings. We evaluate three potential methods for policy generalization: data augmentation, meta-learning and adversarial training. We find our data augmentation method to be effective, and study the potential of meta-learning and adversarial learning as alternative task-agnostic approaches.
 +https://openreview.net/forum?id=Ske25sC9FQ Robustness and Equivariance of Neural Networks
 +Robustness to rotations comes at the cost of robustness to pixel-wise adversarial perturbations.
 +https://arxiv.org/abs/1809.10083v1 Unsupervised Adversarial Invariance
 +We present a novel unsupervised invariance induction framework for neural networks that learns a split representation of data through competitive training between the prediction task and a reconstruction task coupled with disentanglement, without needing any labeled information about nuisance factors or domain knowledge. We describe an adversarial instantiation of this framework and provide analysis of its working. Our unsupervised model outperforms state-of-the-art methods, which are supervised, at inducing invariance to inherent nuisance factors, effectively using synthetic data augmentation to learn invariance, and domain adaptation. Our method can be applied to any prediction task, e.g., binary/multi-class classification or regression, without loss of generality.
 +Disentanglement between e1 and e2 is achieved in a novel way through two adversarial disentanglers: one that aims to predict e2 from e1 and another that does the inverse.