**Name** Autoencoder

**Intent**

Train, via unsupervised learning, a model that can reconstruct its original input representation.
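As a minimal sketch of this intent, the following NumPy snippet trains a tiny linear autoencoder whose only training target is the input itself. The function name `train_autoencoder`, the toy data, the linear encoder and decoder, and all hyperparameters are assumptions chosen for illustration; practical autoencoders usually add nonlinearities and are built with a deep learning framework.

```python
import numpy as np

def train_autoencoder(X, hidden_dim=2, lr=0.01, epochs=1000, seed=0):
    """Train a tiny linear autoencoder by gradient descent on reconstruction MSE.

    No labels are involved: the input X is also the training target.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W_enc = rng.normal(scale=0.1, size=(n_features, hidden_dim))  # encoder weights
    W_dec = rng.normal(scale=0.1, size=(hidden_dim, n_features))  # decoder weights
    losses = []
    for _ in range(epochs):
        Z = X @ W_enc        # encode into the low-dimensional code
        X_hat = Z @ W_dec    # decode back to input space
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))
        # Gradient of the mean squared error with respect to the reconstruction
        g_out = 2.0 * err / err.size
        grad_dec = Z.T @ g_out
        grad_enc = X.T @ (g_out @ W_dec.T)
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return W_enc, W_dec, losses

# Toy data (assumed): 5-D points that lie on a 2-D linear subspace.
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing

W_enc, W_dec, losses = train_autoencoder(X)
print(f"reconstruction MSE: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

The bottleneck (`hidden_dim` smaller than the input dimension) is what forces the model to learn a compact code rather than copy the input through unchanged.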

**Motivation**

How can we train a model without requiring the training data to be labeled?

**Sketch**

<Diagram>

**Discussion**

**Known Uses**

Variational Autoencoder

**Related Patterns**

<Diagram>

**References**

http://arxiv.org/abs/1206.5533

http://www.deeplearningbook.org/contents/autoencoders.html

http://www.deeplearningbook.org/contents/representation.html 15.1 Greedy Layer-Wise Unsupervised Pretraining

Unsupervised learning played a key historical role in the revival of deep neural networks, allowing for the first time to train a deep supervised network without requiring architectural specializations like convolution or recurrence. We call this procedure unsupervised pretraining, or more precisely, greedy layer-wise unsupervised pretraining. This procedure is a canonical example of how a representation learned for one task (unsupervised learning, trying to capture the shape of the input distribution) can sometimes be useful for another task (supervised learning with the same input domain).

http://arxiv.org/pdf/1603.06653v1.pdf Information Theoretic-Learning Auto-Encoder

Information-theoretic learning (ITL) is a field at the intersection of machine learning and information theory which encompasses a family of algorithms that compute and optimize information-theoretic descriptors such as entropy, divergence, and mutual information. ITL objectives are computed directly from samples (non-parametrically) using Parzen windowing and Renyi’s entropy.
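To make "computed directly from samples" concrete, here is a hedged sketch of the standard Parzen-window estimator of Renyi's quadratic entropy, H2 = -log V, where the "information potential" V is the mean of a Gaussian kernel over all sample pairs. The function name `renyi_quadratic_entropy` and the choice of a Gaussian kernel with bandwidth `sigma` are assumptions for illustration; this is not the specific objective used in the paper.

```python
import numpy as np

def renyi_quadratic_entropy(samples, sigma=1.0):
    """Parzen-window estimate of Renyi's quadratic entropy from samples."""
    x = np.asarray(samples, dtype=float)
    if x.ndim == 1:
        x = x[:, None]                      # treat a 1-D array as n scalar samples
    d = x.shape[1]
    diffs = x[:, None, :] - x[None, :, :]   # all pairwise differences
    sq_dists = np.sum(diffs ** 2, axis=-1)
    # Convolving two Gaussian kernels of variance sigma^2 gives variance 2*sigma^2.
    var = 2.0 * sigma ** 2
    kernel = np.exp(-sq_dists / (2.0 * var)) / (2.0 * np.pi * var) ** (d / 2.0)
    information_potential = kernel.mean()   # V = (1/N^2) * sum_ij G(x_i - x_j)
    return -np.log(information_potential)

# A more spread-out sample should receive a higher entropy estimate.
rng = np.random.default_rng(0)
narrow = rng.normal(scale=1.0, size=500)
wide = rng.normal(scale=3.0, size=500)
h_narrow = renyi_quadratic_entropy(narrow, sigma=0.5)
h_wide = renyi_quadratic_entropy(wide, sigma=0.5)
print(h_narrow < h_wide)
```

Because the estimate is a smooth function of the samples, it can be differentiated and used as a training objective, which is what makes it attractive for ITL-style regularizers.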

https://ayearofai.com/lenny-2-autoencoders-and-word-embeddings-oh-my-576403b0113a#.7n4vk8us5

http://arxiv.org/pdf/1606.04934v1.pdf Improving Variational Inference with Inverse Autoregressive Flow

https://arxiv.org/pdf/1611.09842v1.pdf Split-Brain Autoencoders: Unsupervised Learning by Cross-Channel Prediction

The method adds a split to the network, resulting in two disjoint sub-networks. Each sub-network is trained to perform a difficult task – predicting one subset of the data channels from another. Together, the sub-networks extract features from the entire input signal. By forcing the network to solve cross-channel prediction tasks, we induce a representation within the network which transfers well to other, unseen tasks.

The proposed method solves some of the weaknesses of previous self-supervised methods. Specifically, the method (i) does not require a representational bottleneck for training, (ii) uses input dropout to help force abstraction in the representation, and (iii) is pre-trained on full images, and thus able to extract features from the full input data.
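The cross-channel prediction idea can be illustrated with a small NumPy sketch: the input channels are split into two groups, one small network predicts group B from group A, another predicts A from B, and the two hidden layers are concatenated into a single feature vector. Everything here is an assumption made for illustration (the helper names `train_predictor` and `split_brain_features`, the one-hidden-layer tanh networks, the squared-error loss, and the toy data); the paper itself uses deep convolutional networks on image channels.

```python
import numpy as np

def train_predictor(inputs, targets, hidden_dim=4, lr=0.05, epochs=300, seed=0):
    """Fit a one-hidden-layer tanh network that predicts targets from inputs."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.1, size=(inputs.shape[1], hidden_dim))
    W2 = rng.normal(scale=0.1, size=(hidden_dim, targets.shape[1]))
    losses = []
    for _ in range(epochs):
        H = np.tanh(inputs @ W1)            # hidden features
        err = H @ W2 - targets              # prediction error on the other channels
        losses.append(float(np.mean(err ** 2)))
        g_out = 2.0 * err / err.size        # gradient of the MSE w.r.t. the prediction
        grad_W2 = H.T @ g_out
        grad_W1 = inputs.T @ ((g_out @ W2.T) * (1.0 - H ** 2))
        W2 -= lr * grad_W2
        W1 -= lr * grad_W1
    return W1, losses

def split_brain_features(X, split, **kwargs):
    """Train two disjoint predictors and concatenate their hidden features."""
    A, B = X[:, :split], X[:, split:]
    W_a, loss_a = train_predictor(A, B, **kwargs)   # sub-network 1: A -> B
    W_b, loss_b = train_predictor(B, A, **kwargs)   # sub-network 2: B -> A
    features = np.concatenate([np.tanh(A @ W_a), np.tanh(B @ W_b)], axis=1)
    return features, loss_a, loss_b

# Toy data (assumed): 6 channels driven by 2 shared latent factors,
# so each half of the channels carries information about the other half.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 2))
X = latent @ (0.5 * rng.normal(size=(2, 6)))

features, loss_a, loss_b = split_brain_features(X, split=3)
print(features.shape)
```

Note that neither sub-network sees its own target channels as input, so the task cannot be solved by copying, and the concatenated features still cover all input channels.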

https://arxiv.org/abs/1506.02351v8 Stacked What-Where Auto-encoders

We present a novel architecture, the “stacked what-where auto-encoders” (SWWAE), which integrates discriminative and generative pathways and provides a unified approach to supervised, semi-supervised and unsupervised learning without relying on sampling during training.

The overall system, which can be seen as pairing a Convnet with a Deconvnet, yields good accuracy on a variety of semi-supervised and supervised tasks.

https://thecuriousaicompany.com/connection-to-g/ Learning by Denoising Part 1: What and why of denoising

https://arxiv.org/pdf/1710.10368v1.pdf Deep Generative Dual Memory Network for Continual Learning

https://arxiv.org/abs/1712.07788v2 Deep Unsupervised Clustering Using Mixture of Autoencoders

https://openreview.net/forum?id=HkL7n1-0b Wasserstein Auto-Encoders

https://arxiv.org/abs/1805.09804v1 Implicit Autoencoders

In this paper, we describe the “implicit autoencoder” (IAE), a generative autoencoder in which both the generative path and the recognition path are parametrized by implicit distributions. We use two generative adversarial networks to define the reconstruction and the regularization cost functions of the implicit autoencoder, and derive the learning rules based on maximum-likelihood learning. Using implicit distributions allows us to learn more expressive posterior and conditional likelihood distributions for the autoencoder. Learning an expressive conditional likelihood distribution enables the latent code to only capture the abstract and high-level information of the data, while the remaining information is captured by the implicit conditional likelihood distribution. For example, we show that implicit autoencoders can disentangle the global and local information, and perform deterministic or stochastic reconstructions of the images. We further show that implicit autoencoders can disentangle discrete underlying factors of variation from the continuous factors in an unsupervised fashion, and perform clustering and semi-supervised learning.

https://arxiv.org/abs/1806.08462v1 Probabilistic Natural Language Generation with Wasserstein Autoencoders

https://colinraffel.com/publications/arxiv2018understanding.pdf

https://colinraffel.com/talks/vector2018few.pdf