https://arxiv.org/pdf/1707.09219.pdf Recurrent Ladder Networks

We propose a recurrent extension of the Ladder network [24], which is motivated by the inference required in hierarchical latent variable models. We demonstrate that the recurrent Ladder is able to handle a wide variety of complex learning tasks that benefit from iterative inference and temporal modeling.

https://arxiv.org/abs/1703.01560 LR-GAN: Layered Recursive Generative Adversarial Networks for Image Generation https://github.com/jwyang/lr-gan.pytorch

https://medium.com/towards-data-science/a-new-kind-of-deep-neural-networks-749bcde19108

http://www.fil.ion.ucl.ac.uk/spm/doc/papers/Reinforcement_Learning_or_Active_Inference.pdf

In summary, the free-energy formulation dispenses with value functions and prescribes optimal trajectories in terms of prior expectations. Active inference ensures these trajectories are followed, even under random perturbations. In what sense are priors optimal? They are optimal in the sense that they restrict the states of an agent to a small part of state-space. In this formulation, rewards do not attract trajectories; rewards are just sensory states that are visited frequently. If we want to change the behaviour of an agent in a social or experimental setting, we simply induce new (empirical) priors by exposing the agent to a new environment. From the engineering perspective, the ensuing behaviour is remarkably robust to noise and limited only by the specification of the new (controlled) environment.

https://arxiv.org/pdf/1503.04187.pdf A Minimal Active Inference Agent

http://ac.els-cdn.com/S0149763416307096/1-s2.0-S0149763416307096-main.pdf Deep temporal models and active inference

http://www.fil.ion.ucl.ac.uk/~karl/Active%20inference%20and%20learning.pdf Active inference and learning

https://arxiv.org/pdf/1706.00885v3.pdf IDK Cascades: Fast Deep Learning by Learning not to Overthink

We introduce the “I Don't Know” (IDK) prediction cascades framework, a general framework for composing a set of pre-trained models to accelerate inference without a loss in prediction accuracy. We propose two search based methods for constructing cascades as well as a new cost-aware objective within this framework. We evaluate these techniques on a range of both benchmark and real-world datasets and demonstrate that prediction cascades can reduce computation by 37%, resulting in up to 1.6x speedups in image classification tasks over state-of-the-art models without a loss in accuracy.
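The core mechanism can be sketched in a few lines: a cheap model answers when it is confident, and uncertain inputs (the "IDK" cases) fall through to a costlier model. The toy models and the confidence threshold below are hypothetical stand-ins, not the paper's learned cascade.

```python
def cheap_model(x):
    # Toy classifier: confident on small inputs, unsure on larger ones.
    prob = 0.95 if x < 5 else 0.55
    label = 0 if x < 5 else 1
    return label, prob

def expensive_model(x):
    # Costlier fallback model, assumed accurate everywhere.
    return (0 if x < 7 else 1), 0.99

def cascade_predict(x, threshold=0.9):
    label, conf = cheap_model(x)
    if conf >= threshold:          # cheap model is confident: stop early
        return label, "cheap"
    # Below threshold the cheap model effectively says "I don't know",
    # and the input is escalated to the expensive model.
    return expensive_model(x)[0], "expensive"

print(cascade_predict(2))   # answered by the cheap model
print(cascade_predict(6))   # escalated to the expensive model
```

In the paper the threshold (and the cascade composition itself) is chosen by search under a cost-aware objective rather than fixed by hand.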

https://arxiv.org/pdf/1709.07432.pdf DYNAMIC EVALUATION OF NEURAL SEQUENCE MODELS

Dynamic evaluation methods continuously adapt the model parameters θg, learned at training time, to parts of a sequence during evaluation.
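The idea can be illustrated with a minimal sketch: while scoring a test sequence, take a gradient step on each element just after it has been evaluated, so the parameters track local statistics. The model here (a scalar AR(1) predictor fit by SGD) and the learning rate are illustrative stand-ins, not the paper's RNN setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Test sequence generated by x_t = 0.8 * x_{t-1} + noise.
seq = [1.0]
for _ in range(200):
    seq.append(0.8 * seq[-1] + 0.5 * rng.standard_normal())

theta = 0.0                  # parameter "learned at training time" (stand-in)
lr = 0.05
losses = []
for t in range(1, len(seq)):
    pred = theta * seq[t - 1]
    err = pred - seq[t]
    losses.append(err ** 2)              # evaluate first ...
    theta -= lr * 2 * err * seq[t - 1]   # ... then adapt on what was just seen

print(theta)   # drifts toward the generating coefficient 0.8
```

The crucial ordering is evaluate-then-update: each element is scored before the model has seen it, so adaptation never leaks future information into the current prediction.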

https://arxiv.org/abs/1706.04008v1 Recurrent Inference Machines for Solving Inverse Problems

Much of the recent research on solving iterative inference problems focuses on moving away from hand-chosen inference algorithms and towards learned inference. In the latter, the inference process is unrolled in time and interpreted as a recurrent neural network (RNN), which allows for joint learning of model and inference parameters with back-propagation through time. In this framework, the RNN architecture is directly derived from a hand-chosen inference algorithm, effectively limiting its capabilities. We propose a learning framework, called Recurrent Inference Machines (RIM), in which we turn algorithm construction the other way round: given data and a task, train an RNN to learn an inference algorithm. Because RNNs are Turing complete [1, 2], they are capable of implementing any inference algorithm. The framework allows for an abstraction which removes the need for domain knowledge. We demonstrate in several image restoration experiments that this abstraction is effective, allowing us to achieve state-of-the-art performance on image denoising and super-resolution tasks and superior across-task generalization.
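The structure RIM learns can be sketched as an iterated update x ← x + f(∇ log-likelihood, state), where f is normally an RNN trained end-to-end. In the sketch below f is a hand-set gradient step with a momentum buffer standing in for the learned recurrent state; the linear inverse problem and step sizes are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 4))   # known forward operator of the inverse problem
x_true = np.array([1.0, -2.0, 0.5, 3.0])
y = A @ x_true                    # noiseless measurements, for simplicity

x = np.zeros(4)                   # current estimate
h = np.zeros(4)                   # recurrent state (here: a momentum buffer)
step, beta = 0.02, 0.5

for _ in range(500):
    grad = A.T @ (A @ x - y)      # gradient of 0.5 * ||A x - y||^2
    h = beta * h - step * grad    # stand-in for the learned RNN update
    x = x + h                     # refine the estimate

print(np.round(x, 2))             # approaches x_true
```

In RIM proper, the update and its state are a trained RNN, so the "algorithm" itself (step sizes, preconditioning, stopping behaviour) is learned from data rather than fixed as above.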

https://arxiv.org/abs/1802.04762v1 Deep Predictive Coding Network for Object Recognition

PCN reuses a single architecture to recursively run bottom-up and top-down processes, enabling an increasingly long cascade of non-linear transformations. For image classification, PCN refines its representation over time towards more accurate and definitive recognition.
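A single-layer sketch of the predictive-coding loop PCN builds on: a top-down pass predicts the input from the current representation, and a bottom-up pass refines the representation from the prediction error. The weights and step size below are illustrative; PCN stacks many such layers and uses convolutional transforms.

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.standard_normal((16, 4)) / 4.0   # top-down weights: representation -> input
r_true = np.array([1.0, 0.5, -1.0, 2.0])
x = W @ r_true                            # observed input

r = np.zeros(4)                           # representation, refined over time
for _ in range(400):
    x_hat = W @ r                         # top-down: predict the input
    error = x - x_hat                     # prediction error
    r = r + 0.1 * (W.T @ error)           # bottom-up: refine the representation

print(np.round(r, 2))                     # converges toward r_true
```

Each pass through the loop is one more non-linear (here, for simplicity, linear) transformation of the same weights, which is the "increasingly long cascade" the abstract refers to.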

https://github.com/nyu-dl/dl4mt-nonauto Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement

https://arxiv.org/abs/1803.11189v1 Iterative Visual Reasoning Beyond Convolutions

The framework consists of two core modules: a local module that uses spatial memory to store previous beliefs with parallel updates; and a global graph-reasoning module. Our graph module has three components: a) a knowledge graph where we represent classes as nodes and build edges to encode different types of semantic relationships between them; b) a region graph of the current image where regions in the image are nodes and spatial relationships between these regions are edges; c) an assignment graph that assigns regions to classes. Both the local module and the global module roll out iteratively and cross-feed predictions to each other to refine estimates. The final predictions are made by combining the best of both modules with an attention mechanism. We show strong performance over plain ConvNets, e.g., achieving an 8.4% absolute improvement on ADE measured by per-class average precision. Analysis also shows that the framework is resilient to missing regions for reasoning.

https://arxiv.org/abs/1805.08136v1 Meta-learning with differentiable closed-form solvers

In this work we propose to use these fast-converging methods as the main adaptation mechanism for few-shot learning. The main idea is to teach a deep network to use standard machine learning tools, such as logistic regression, as part of its own internal model, enabling it to quickly adapt to novel tasks. This requires back-propagating errors through the solver steps.
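The base learner in this line of work can be sketched with ridge regression, whose closed-form solution makes task adaptation a single differentiable linear-algebra expression. In the paper the features come from a deep network and errors are back-propagated through the solve; here the features are fixed random vectors, which keeps the sketch self-contained.

```python
import numpy as np

def ridge_fit(X, Y, lam=0.1):
    # W = (X^T X + lam I)^{-1} X^T Y -- every operation here is
    # differentiable, so an autodiff framework can back-propagate
    # through the solve into the feature extractor that produced X.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

rng = np.random.default_rng(3)
W_true = rng.standard_normal((5, 2))

# "Support set" of a few-shot task: adapt in one closed-form step.
X_s = rng.standard_normal((10, 5))
Y_s = X_s @ W_true
W = ridge_fit(X_s, Y_s, lam=1e-4)

# "Query set": evaluate the adapted predictor.
X_q = rng.standard_normal((4, 5))
print(np.max(np.abs(X_q @ W - X_q @ W_true)))   # near-exact recovery
```

Because adaptation is one matrix solve rather than an inner optimisation loop, the meta-learner gets fast, exact per-task solutions while remaining end-to-end trainable.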