https://arxiv.org/pdf/1703.04474v1.pdf DRAGNN: A Transition-based Framework for Dynamically Connected Neural Networks

Our basic module is a new generic unit, the Transition-Based Recurrent Unit (TBRU). In addition to hidden layer activations, TBRUs have discrete state dynamics that allow network connections to be built dynamically as a function of intermediate activations. By connecting multiple TBRUs, we can extend and combine commonly used architectures such as sequence-to-sequence, attention mechanisms, and recursive tree-structured models. A TBRU can also serve simultaneously as an encoder for downstream tasks and as a decoder for its own task, resulting in more accurate multi-task learning. We call our approach Dynamic Recurrent Acyclic Graphical Neural Networks, or DRAGNN.
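
As a concrete illustration of the TBRU abstraction, here is a minimal PyTorch sketch with hypothetical names (TBRU, recurrence_fn), not code from the DRAGNN release: the cell's recurrent input is gathered from earlier hidden states chosen by a recurrence function, which is how connections get built dynamically as the transition system advances.

```python
# Minimal TBRU sketch: a recurrent cell whose recurrent input is not fixed
# to the previous step but gathered dynamically by a recurrence function.
import torch
import torch.nn as nn

class TBRU(nn.Module):
    def __init__(self, feat_dim, hidden_dim):
        super().__init__()
        self.cell = nn.GRUCell(feat_dim, hidden_dim)
        self.hidden_dim = hidden_dim

    def forward(self, features, recurrence_fn):
        # features: list of (1, feat_dim) tensors, one per transition step.
        # recurrence_fn(step, hiddens) -> indices of earlier hidden states
        # to connect to; this is where the computation graph is built
        # dynamically as a function of the discrete transition state.
        hiddens = []
        for step, f in enumerate(features):
            links = recurrence_fn(step, hiddens)
            if links:  # average the dynamically linked hidden states
                hx = torch.stack([hiddens[i] for i in links]).mean(0)
            else:      # no links yet (e.g. the first step)
                hx = torch.zeros(1, self.hidden_dim)
            hiddens.append(self.cell(f, hx))
        return hiddens

# r(s) = {s - 1} recovers an ordinary left-to-right RNN; a parser-style
# r(s) could instead return the hidden states of a token's subtree heads.
feats = [torch.randn(1, 16) for _ in range(5)]
out = TBRU(16, 32)(feats, lambda s, hs: [s - 1] if s > 0 else [])
```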

https://arxiv.org/abs/1706.09067 Structured Recommendation

https://arxiv.org/pdf/1707.09627.pdf Learning to Infer Graphics Programs from Hand-Drawn Images

We introduce a model that learns to convert simple hand drawings into graphics programs written in a subset of LaTeX. The model combines techniques from deep learning and program synthesis. We learn a convolutional neural network that proposes plausible drawing primitives that explain an image. This set of drawing primitives is like an execution trace for a graphics program. From this trace we use program synthesis techniques to recover a graphics program with constructs such as variable bindings, iterative loops, or simple kinds of conditionals. With a graphics program in hand, we can correct errors made by the deep network, cluster drawings by their use of similar high-level geometric structures, and extrapolate drawings. Taken together, these results are a step towards agents that induce useful, human-readable programs from perceptual input.
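
A toy sketch of the trace-to-program step, assuming the vision model has already proposed a list of circle primitives; synthesize_loop is a hand-rolled stand-in for the paper's actual synthesizer, meant only to show how a compact looping program can explain, and extrapolate, a trace of primitives.

```python
# Toy synthesis: find a loop that reproduces a trace of circle primitives.
def synthesize_loop(circles):
    """circles: sorted x-coordinates of unit circles proposed by the CNN."""
    if len(circles) >= 2:
        step = circles[1] - circles[0]
        # If the trace is an arithmetic progression, a single for-loop
        # explains every primitive, and extending the range extrapolates
        # the drawing beyond what was observed.
        if all(b - a == step for a, b in zip(circles, circles[1:])):
            return (f"for i in range({len(circles)}): "
                    f"circle(x={circles[0]} + {step}*i)")
    # Otherwise fall back to one literal command per primitive (raw trace).
    return "; ".join(f"circle(x={x})" for x in circles)

print(synthesize_loop([2, 5, 8, 11]))
# -> for i in range(4): circle(x=2 + 3*i)
```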

https://arxiv.org/abs/1808.07535 Learning Hierarchical Semantic Image Manipulation through Structured Representations

In this paper, we presented a hierarchical framework for semantic image manipulation. We first learn to generate pixel-wise semantic label maps given the initial object bounding boxes. Then we learn to generate the manipulated image from the predicted label maps. This framework allows the user to manipulate images at the object level by adding, removing, or moving one object bounding box at a time. Experimental evaluations demonstrate the advantages of the hierarchical manipulation framework over existing image generation and context hole-filling models, both qualitatively and quantitatively. We further demonstrate its practical benefits in semantic object manipulation, interactive image editing, and data-driven image editing. Future research directions include preserving object identity and providing affordance as additional user input during image manipulation.
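
A schematic sketch of the two-stage data flow described above, with hypothetical module names (BoxToLayout, LayoutToImage) and toy layer sizes; the paper trains both stages adversarially, which this sketch omits to show only the box-to-layout-to-image structure.

```python
# Stage 1 fills an object bounding box with a predicted semantic label map;
# stage 2 renders pixels from the full, manipulated label map.
import torch
import torch.nn as nn

NUM_CLASSES = 20  # assumed size of the semantic label vocabulary

class BoxToLayout(nn.Module):
    """Stage 1: predict per-pixel class scores inside a bounding box."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(NUM_CLASSES, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, NUM_CLASSES, 3, padding=1))

    def forward(self, context_layout, box):
        y0, y1, x0, x1 = box
        logits = self.net(context_layout)
        # Only the box region is replaced; the surrounding layout is kept,
        # which is what makes the edit local and object-level.
        out = context_layout.clone()
        out[:, :, y0:y1, x0:x1] = logits.softmax(1)[:, :, y0:y1, x0:x1]
        return out

class LayoutToImage(nn.Module):
    """Stage 2: render an RGB image from the manipulated label map."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(NUM_CLASSES, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh())

    def forward(self, layout):
        return self.net(layout)

layout = torch.rand(1, NUM_CLASSES, 64, 64).softmax(1)
new_layout = BoxToLayout()(layout, box=(16, 48, 16, 48))
image = LayoutToImage()(new_layout)  # (1, 3, 64, 64)
```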

https://arxiv.org/abs/1810.01868v1 Deep processing of structured data

We construct a general unified framework for learning representations of structured data, i.e. data that cannot be represented as fixed-length vectors (e.g. sets, graphs, texts, or images of varying sizes). The key role is played by an intermediate network called SAN (Set Aggregating Network), which maps a structured object to a fixed-length vector in a high-dimensional latent space. Our main theoretical result shows that for a sufficiently large latent-space dimension, SAN is capable of learning a unique representation for every input example. Experiments demonstrate that replacing the pooling operation with SAN in convolutional networks leads to better results in classifying images of different sizes. Moreover, applying it directly to text and graph data yields results close to the state of the art, using simpler networks with fewer parameters than competing models. https://github.com/gmum/
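
A minimal sketch of the SAN idea, with illustrative layer sizes that are an assumption, not the paper's configuration: every element of a variable-size set passes through a shared nonlinear map into a high-dimensional latent space, and the activations are aggregated into a single fixed-length vector, so inputs of any size yield codes of the same shape.

```python
# SAN-style aggregation: shared per-element map, then mean over the set.
import torch
import torch.nn as nn

class SetAggregatingNetwork(nn.Module):
    def __init__(self, in_dim, latent_dim):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_dim, latent_dim), nn.ReLU())

    def forward(self, elements):
        # elements: (set_size, in_dim); set_size may differ per example.
        # Mean-aggregation makes the output invariant to element order
        # and independent of the set's size.
        return self.phi(elements).mean(dim=0)

san = SetAggregatingNetwork(in_dim=8, latent_dim=256)
z_small = san(torch.randn(3, 8))    # a 3-element set ...
z_large = san(torch.randn(100, 8))  # ... and a 100-element set
assert z_small.shape == z_large.shape == (256,)  # same fixed-length code
```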

http://papers.nips.cc/paper/7287-structure-aware-convolutional-neural-networks Structure-Aware Convolutional Neural Networks