http://www.cs.toronto.edu/~fritz/absps/transauto6.pdf Transforming Auto-encoders
https://www.youtube.com/watch?v=TFIMqt0yT2I Geoffrey Hinton: “Does the Brain do Inverse Graphics?”
http://cseweb.ucsd.edu/~gary/cs200/s12/Hinton.pdf Does the Brain do Inverse Graphics?
https://arxiv.org/abs/1503.03167 Deep Convolutional Inverse Graphics Network
http://willwhitney.com/dc-ign/www/ https://github.com/willwhitney/dc-ign
https://www.cs.toronto.edu/~hinton/csc2535/notes/lec6b.pdf Taking Inverse Graphics Seriously
https://arxiv.org/pdf/1406.6901.pdf
https://www.vicarious.com/img/icml2017-schemas.pdf Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics
In pursuit of efficient and robust generalization, we introduce the Schema Network, an object-oriented generative physics simulator capable of disentangling multiple causes of events and reasoning backward through causes to achieve goals. The richly structured architecture of the Schema Network can learn the dynamics of an environment directly from data. We compare Schema Networks with Asynchronous Advantage Actor-Critic and Progressive Networks on a suite of Breakout variations, reporting results on training efficiency and zero-shot generalization, consistently demonstrating faster, more robust learning and better transfer.
https://arxiv.org/abs/1801.05091v1 Inferring Semantic Layout for Hierarchical Text-to-Image Synthesis
We propose a novel hierarchical approach for text-to-image synthesis by inferring semantic layout. Instead of learning a direct mapping from text to image, our algorithm decomposes the generation process into multiple steps: it first constructs a semantic layout from the text with the layout generator, then converts the layout to an image with the image generator. The proposed layout generator progressively constructs a semantic layout in a coarse-to-fine manner by generating object bounding boxes and refining each box by estimating the object shape inside it. The image generator synthesizes an image conditioned on the inferred semantic layout, which provides a useful semantic structure of the image matching the text description. Our model not only generates semantically more meaningful images, but also allows automatic annotation of generated images and a user-controlled generation process by modifying the generated scene layout. We demonstrate the capability of the proposed model on the challenging MS-COCO dataset and show that it can substantially improve image quality, interpretability of output, and semantic alignment to input text over existing approaches.
https://arxiv.org/abs/1801.09597v1 Deep Reinforcement Learning using Capsules in Advanced Game Environments
This thesis introduces the use of CapsNet for Q-Learning based game algorithms. To successfully apply CapsNet in advanced game play, three main contributions follow. First, the introduction of four new game environments as frameworks for RL research with increasing complexity, namely Flash RL, Deep Line Wars, Deep RTS, and Deep Maze. These environments fill the gap between the relatively simple and the more complex game environments available for RL research, and are used in the thesis to test and explore CapsNet behavior. Second, the thesis introduces a generative modeling approach to produce artificial training data for use in Deep Learning models, including CapsNets. We empirically show that conditional generative modeling can generate game data of sufficient quality to train a Deep Q-Network well. Third, we show that CapsNet is a reliable architecture for Deep Q-Learning based algorithms for game AI. A capsule is a group of neurons that determines the presence of an object in the data, and is shown in the literature to increase the robustness of training and predictions while lowering the amount of training data needed. It should, therefore, be well suited to game play.
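The capsule definition above (a group of neurons read as one vector, whose length signals the presence of an object) can be made concrete with the standard squash nonlinearity from the original CapsNet paper. A minimal NumPy sketch, not code from the thesis; the pose vector and its size are illustrative:

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Squash nonlinearity from Sabour et al. (2017): short vectors
    shrink toward zero and long vectors approach unit length, so a
    capsule's length can be read as the probability that the entity
    it represents is present in the input."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

# A capsule is a small group of neurons treated as one vector:
pose = np.array([0.5, 1.0, 2.0])
activation = np.linalg.norm(squash(pose))  # in (0, 1); larger for longer poses
```

Real capsule layers apply this per capsule across a whole tensor of activations; the scalar length plays the role a sigmoid unit's output plays in an ordinary network.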
https://arxiv.org/pdf/1805.03551v2.pdf A Unified Framework of Deep Neural Networks by Capsules This capsule framework could not only simplify the description of existing DNNs, but also provide a theoretical basis for graphical design and programming of new deep learning models. As future work, we will try to define an industrial standard and implement a graphic platform for the advancement of deep learning with capsule networks, and even extend the framework similarly to recurrent neural networks.
https://arxiv.org/pdf/1804.10172.pdf Capsule networks for low-data transfer learning
The generative capsule network uses what we call a memo architecture, which consists of convolving the images into the Digit Capsules, applying convolutional reconstruction, and classifying images based on the reconstruction
https://arxiv.org/abs/1805.08090v1 Graph Capsule Convolutional Neural Networks
https://arxiv.org/abs/1805.07242 Siamese Capsule Networks
https://github.com/yash-1995-2006/Conditional-and-nonConditional-Capsule-GANs/ Conditional-and-nonConditional-Capsule-GANs
https://github.com/XifengGuo/CapsNet-Keras
https://arxiv.org/abs/1810.05315v1 A Context-aware Capsule Network for Multi-label Classification
We introduce (1) a novel routing weight initialization technique, (2) an improved CapsNet design that exploits semantic relationships between the primary capsule activations using a densely connected Conditional Random Field, and (3) a Cholesky-transformation-based correlation module to learn a general priority scheme. Our proposed design allows CapsNet to scale better to more complex problems, such as multi-label classification, where semantically related categories co-exist with various interdependencies.
https://arxiv.org/abs/1811.06969v1 DARCCC: Detecting Adversaries by Reconstruction from Class Conditional Capsules
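The title summarizes the DARCCC idea: reconstruct the input from the winning class-conditional capsule and treat a large reconstruction error as evidence of an adversarial input. A hypothetical sketch of that decision rule only; the `reconstruct` hook and the threshold stand in for the paper's trained capsule decoder and a validation-tuned cutoff:

```python
import numpy as np

def flag_adversarial(x, reconstruct, threshold):
    """Flag an input as adversarial when its class-conditional
    reconstruction lies far from the input itself.
    `reconstruct` is a placeholder for a trained capsule decoder."""
    err = float(np.mean((x - reconstruct(x)) ** 2))
    return err > threshold
```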
https://arxiv.org/abs/1812.09707v1 Training Deep Capsule Networks
To ensure that all active capsules form a parse tree, we introduce a new routing algorithm called dynamic deep routing. We show that this routing algorithm allows the training of deeper capsule networks and is also more robust to white box adversarial attacks than the original routing algorithm.
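The "original routing algorithm" this abstract compares against is routing-by-agreement from Sabour et al. (2017), which dynamic deep routing modifies. For reference, a minimal NumPy sketch of that baseline; shapes, the iteration count, and the softmax details are illustrative:

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Shrinks vector length into (0, 1) while preserving orientation.
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, n_iters=3):
    """Routing-by-agreement over prediction vectors u_hat of shape
    (num_lower, num_upper, dim): a lower capsule whose prediction
    agrees with an upper capsule's output gets its coupling to that
    capsule strengthened, so agreeing capsules form parse-tree edges."""
    num_lower, num_upper, _ = u_hat.shape
    b = np.zeros((num_lower, num_upper))            # routing logits
    for _ in range(n_iters):
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)        # coupling coefficients
        s = (c[..., None] * u_hat).sum(axis=0)      # weighted vote per upper capsule
        v = squash(s)                               # (num_upper, dim) outputs
        b = b + (u_hat * v[None]).sum(axis=-1)      # agreement update
    return v
```

With uniform couplings at the start, upper capsules that receive consistent votes end up with long output vectors, while capsules whose votes cancel stay near zero; the deep-routing variant above changes how these couplings propagate through deeper stacks.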