We combine Riemannian geometry with the mean field theory of high dimensional chaos to study the nature of signal propagation in generic, deep neural networks with random weights. Our results reveal an order-to-chaos expressivity phase transition, with networks in the chaotic phase computing nonlinear functions whose global curvature grows exponentially with depth but not width. We prove this generic class of deep random functions cannot be efficiently computed by any shallow network, going beyond prior work restricted to the analysis of single functions. Moreover, we formalize and quantitatively demonstrate the long conjectured idea that deep networks can disentangle highly curved manifolds in input space into flat manifolds in hidden space. Our theoretical analysis of the expressive power of deep networks broadly applies to arbitrary nonlinearities, and provides a quantitative underpinning for previously abstract notions about the geometry of deep functions.
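The order-to-chaos transition described above can be illustrated numerically: propagate two nearby inputs through a deep random-weight tanh network and watch their similarity either converge toward 1 (ordered phase, small weight variance) or settle well below 1 (chaotic phase, large weight variance). A minimal NumPy sketch, assuming a tanh nonlinearity and Gaussian weights of variance sigma_w^2/N; the width, depth, and sigma values are illustrative choices, not taken from the paper:

<code python>
import numpy as np

def similarity_through_depth(sigma_w, sigma_b=0.1, width=1000, depth=50, seed=0):
    """Propagate two nearly identical inputs through a random tanh network
    and return their cosine similarity after the final layer."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(width)
    y = x + 0.01 * rng.standard_normal(width)   # a small perturbation of x
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * sigma_w / np.sqrt(width)
        b = rng.standard_normal(width) * sigma_b
        x, y = np.tanh(W @ x + b), np.tanh(W @ y + b)
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

print(similarity_through_depth(sigma_w=1.0))  # ordered phase: stays close to 1
print(similarity_through_depth(sigma_w=4.0))  # chaotic phase: drops well below 1
</code>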

https://arxiv.org/abs/1802.01071 Hierarchical Adversarially Learned Inference

We show that the features learned by our model in an unsupervised way outperform the best handcrafted features.

https://arxiv.org/pdf/1802.04473.pdf Information Scaling Law of Deep Neural Networks

With a typical class of DNNs, the Convolutional Arithmetic Circuits (ConvACs), complex deep networks can be written as an explicit mathematical formula, so rigorous mathematical tools, in particular information theory, can be used to analyse them. In this paper, we propose a novel information scaling law scheme that interprets the network's inner organization through information theory. First, we give an informational interpretation of the activation function. Second, we prove that information entropy increases as information is transmitted through the ConvACs. Finally, we propose the information scaling law of ConvACs under a reasonable assumption.
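The layer-by-layer entropy argument can be made concrete with a toy measurement: estimate an upper bound on the differential entropy of the activations at each layer (here via a Gaussian fit, 0.5·log det(2πeΣ)) and track it with depth. This is only an illustrative sketch on a generic random tanh network, not the paper's ConvAC analysis, and whether the estimate grows or shrinks here depends on the nonlinearity and weight scale:

<code python>
import numpy as np

def gaussian_entropy(acts):
    """Gaussian (upper-bound) estimate of differential entropy in nats,
    from activations of shape (n_samples, dim)."""
    cov = np.cov(acts, rowvar=False) + 1e-6 * np.eye(acts.shape[1])
    _, logdet = np.linalg.slogdet(2 * np.pi * np.e * cov)
    return 0.5 * logdet

def entropy_per_layer(width=64, depth=6, n_samples=5000, sigma_w=1.5, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_samples, width))        # input distribution
    entropies = [gaussian_entropy(x)]
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * sigma_w / np.sqrt(width)
        x = np.tanh(x @ W)                             # one random layer
        entropies.append(gaussian_entropy(x))
    return entropies

print(entropy_per_layer())
</code>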

https://arxiv.org/abs/1712.00409 Deep Learning Scaling is Predictable, Empirically

https://arxiv.org/pdf/1804.02808v1.pdf Latent Space Policies for Hierarchical Reinforcement Learning

First, each layer in the hierarchy can be trained with exactly the same algorithm. Second, by using an invertible mapping from latent variables to actions, each layer becomes invertible, which means that the higher layer can always perfectly invert any behavior of the lower layer. This makes it possible to train lower layers on heuristic shaping rewards, while higher layers can still optimize task-specific rewards with good asymptotic performance. Finally, our method has a natural interpretation as an iterative procedure for constructing graphical models that gradually simplify the task dynamics.
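The central mechanism above, an invertible mapping from latent variables to actions at every level, is what makes the "perfect inversion" property possible: because each layer is a bijection, the composite latent-to-action map can be inverted exactly, so a higher layer can select the latent that reproduces any behavior of the layers below. A minimal NumPy sketch with elementwise affine bijectors (the paper conditions its bijectors on the observation and trains them with maximum-entropy RL; the layer form and sizes here are purely illustrative):

<code python>
import numpy as np

class AffineBijector:
    """A simple invertible layer: action = exp(log_scale) * z + shift."""
    def __init__(self, dim, rng):
        self.log_scale = 0.1 * rng.standard_normal(dim)
        self.shift = rng.standard_normal(dim)

    def forward(self, z):
        return np.exp(self.log_scale) * z + self.shift

    def inverse(self, a):
        return (a - self.shift) * np.exp(-self.log_scale)

rng = np.random.default_rng(0)
dim = 4
layers = [AffineBijector(dim, rng) for _ in range(3)]   # top layer first

# Higher-level latent -> action: pass through every layer in turn.
z_top = rng.standard_normal(dim)
a = z_top
for layer in layers:
    a = layer.forward(a)

# Because every layer is a bijection, the top-level latent that produces a
# given action can be recovered exactly (the "perfect inversion" property).
z_back = a
for layer in reversed(layers):
    z_back = layer.inverse(z_back)

assert np.allclose(z_back, z_top)
</code>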

https://openreview.net/forum?id=S1JHhv6TW Boosting Dilated Convolutional Networks with Mixed Tensor Decompositions

We focus on dilated convolutional networks, a family of deep models delivering state of the art performance in sequence processing tasks. By introducing and analyzing the concept of mixed tensor decompositions, we prove that interconnecting dilated convolutional networks can lead to expressive efficiency. In particular, we show that even a single connection between intermediate layers can already lead to an almost quadratic gap, which in large-scale settings typically makes the difference between a model that is practical and one that is not.
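The kind of interconnection studied there, a single link between intermediate layers of two dilated convolutional stacks, can be sketched in PyTorch. This is only an architectural illustration (the paper's formal results are stated for convolutional arithmetic circuits); channel counts, depth, and the placement of the mixing connection are arbitrary choices:

<code python>
import torch
import torch.nn as nn

class DilatedStack(nn.Module):
    """A small stack of 1D dilated convolutions (dilation doubles per layer)."""
    def __init__(self, channels, depth):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Conv1d(channels, channels, kernel_size=2,
                      dilation=2 ** i, padding=2 ** i)
            for i in range(depth)
        ])

    def forward(self, x, mix_in=None, mix_at=None):
        feats = []
        for i, conv in enumerate(self.layers):
            if mix_in is not None and i == mix_at:
                x = x + mix_in                           # the single cross-stack link
            x = torch.relu(conv(x))[..., :x.shape[-1]]   # keep sequence length fixed
            feats.append(x)
        return x, feats

channels, depth, length = 8, 4, 64
stack_a, stack_b = DilatedStack(channels, depth), DilatedStack(channels, depth)
x = torch.randn(1, channels, length)

out_a, feats_a = stack_a(x)                              # first stack on its own
out_b, _ = stack_b(x, mix_in=feats_a[1], mix_at=2)       # one intermediate connection
print(out_a.shape, out_b.shape)
</code>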

https://arxiv.org/abs/1807.04640v1 Automatically Composing Representation Transformations as a Means for Generalization

https://arxiv.org/abs/1807.07560v1 Compositional GAN: Learning Conditional Image Composition