structured_prediction [2018/12/04 14:35] (current)
Together these results are a step towards agents that induce useful, human-readable programs from perceptual input.

https://arxiv.org/abs/1808.07535 Learning Hierarchical Semantic Image Manipulation through Structured Representations

In this paper, we presented a hierarchical framework for semantic image manipulation. We first learn to generate pixel-wise semantic label maps from the initial object bounding boxes, and then learn to generate the manipulated image from the predicted label maps. This framework lets the user manipulate images at the object level by adding, removing, or moving one object bounding box at a time. Experimental evaluations demonstrate the advantages of the hierarchical manipulation framework over existing image generation and context hole-filling models, both qualitatively and quantitatively. We further demonstrate its practical benefits in semantic object manipulation, interactive image editing, and data-driven image editing. Future research directions include preserving object identity and providing affordance as additional user input during image manipulation.
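The two-stage decomposition described in the abstract (bounding boxes to label map, label map to image) can be sketched with stand-in functions. In the paper both stages are learned generators; the box rasterizer and flat-color renderer below are toy placeholders, and all names are illustrative:

```python
import numpy as np

def boxes_to_label_map(boxes, shape):
    # Stage 1 stand-in: rasterize (class, box) pairs into a pixel-wise
    # label map. The paper learns this step; here we simply paint boxes.
    label_map = np.zeros(shape, dtype=int)  # 0 = background
    for cls, (y0, x0, y1, x1) in boxes:
        label_map[y0:y1, x0:x1] = cls
    return label_map

def label_map_to_image(label_map, palette):
    # Stage 2 stand-in: render an image from the label map. The paper
    # learns this mapping; here each class id just indexes a flat color.
    return palette[label_map]

palette = np.array([[0, 0, 0],     # class 0: background (black)
                    [255, 0, 0],   # class 1 (red)
                    [0, 255, 0]],  # class 2 (green)
                   dtype=np.uint8)

# Object-level editing amounts to editing the box list and re-running
# both stages.
boxes = [(1, (2, 2, 6, 6))]
img_before = label_map_to_image(boxes_to_label_map(boxes, (8, 8)), palette)
boxes.append((2, (0, 0, 3, 3)))  # "add an object" by adding a box
img_after = label_map_to_image(boxes_to_label_map(boxes, (8, 8)), palette)
```

The point of the decomposition is that the user interacts only with the box list, while the pixel-level work happens downstream of the label map.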

https://arxiv.org/abs/1810.01868v1 Deep processing of structured data

We construct a general unified framework for learning representations of structured data, i.e. data which cannot be represented as fixed-length vectors (e.g. sets, graphs, texts, or images of varying sizes). The key role is played by an intermediate network called SAN (Set Aggregating Network), which maps a structured object to a fixed-length vector in a high-dimensional latent space. Our main theoretical result shows that for a sufficiently large dimension of the latent space, SAN is capable of learning a unique representation for every input example. Experiments demonstrate that replacing the pooling operation with SAN in convolutional networks leads to better results in classifying images of different sizes. Moreover, its direct application to text and graph data yields results close to SOTA with simpler networks and fewer parameters than competitive models. https://github.com/gmum/
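The mechanism behind such set-to-vector maps, a shared per-element transform followed by a permutation-invariant aggregation so that the output length does not depend on the input set size, can be sketched in a few lines. This is a minimal NumPy illustration of the aggregation idea, not the authors' exact SAN architecture; all names and dimensions are illustrative:

```python
import numpy as np

def set_aggregate(elements, W, b):
    # Shared per-element transform (affine + ReLU), then a sum over the
    # set: the output has a fixed length regardless of set size and is
    # invariant to the order of the elements.
    X = np.asarray(elements)        # shape (n, d), n varies per input
    H = np.maximum(X @ W + b, 0.0)  # shared transform, shape (n, k)
    return H.sum(axis=0)            # permutation-invariant, shape (k,)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 8))         # d=3 input dim, k=8 latent dim
b = rng.normal(size=8)

# Sets of different sizes map to vectors of the same fixed length.
small = set_aggregate(rng.normal(size=(4, 3)), W, b)
large = set_aggregate(rng.normal(size=(9, 3)), W, b)
```

Dropping such an aggregator in place of a pooling layer is what lets one network consume inputs of varying size, as the paper evaluates for images, text, and graphs.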

http://papers.nips.cc/paper/7287-structure-aware-convolutional-neural-networks