Differences

This shows you the differences between two versions of the page.

structured_prediction [2018/08/30 21:23]
admin
structured_prediction [2018/12/04 14:35] (current)
admin
Line 22: Line 22:
 https://arxiv.org/abs/1808.07535 Learning Hierarchical Semantic Image Manipulation through Structured Representations
  
-Understanding, reasoning, and manipulating semantic concepts of images have been a fundamental research problem for decades. Previous work mainly focused on direct manipulation on the natural image manifold through color strokes, key-points, textures, and holes-to-fill. In this work, we present a novel hierarchical framework for semantic image manipulation. Key to our hierarchical framework is that we employ a structured semantic layout as our intermediate representation for manipulation. Initialized with coarse-level bounding boxes, our structure generator first creates a pixel-wise semantic layout capturing the object shape, object-object interactions, and object-scene relations. Then our image generator fills in the pixel-level textures guided by the semantic layout. Such a framework allows a user to manipulate images at object-level by adding, removing, and moving one bounding box at a time. Experimental evaluations demonstrate the advantages of the hierarchical manipulation framework over existing image generation and context hole-filling models, both qualitatively and quantitatively. Benefits of the hierarchical framework are further demonstrated in applications such as semantic object manipulation, interactive image editing, and data-driven image manipulation.
 +In this paper, we presented a hierarchical framework for semantic image manipulation. We first learn
 +to generate the pixel-wise semantic label maps given the initial object bounding boxes. Then we learn
 +to generate the manipulated image from the predicted label maps. Such a framework allows the user to
 +manipulate images at object-level by adding, removing, and moving one object bounding box at a time.
 +Experimental evaluations demonstrate the advantages of the hierarchical manipulation framework
 +over existing image generation and context hole-filling models, both qualitatively and quantitatively.
 +We further demonstrate its practical benefits in semantic object manipulation, interactive image
 +editing, and data-driven image editing. Future research directions include preserving the object
 +identity and providing affordance as additional user input during image manipulation.
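
The two-stage data flow described above (bounding boxes, then a label map, then an image) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names, the box tuple layout, and the palette are hypothetical, and both stages are trivial stand-ins (box rasterization and a color lookup) for the learned generators in the paper.

```python
def boxes_to_label_map(boxes, height, width, background=0):
    """Stage 1 stand-in: rasterize (class_id, top, left, bottom, right) boxes.

    The paper learns object shapes inside each box; this toy version
    simply fills the whole box with the class label.
    """
    label_map = [[background] * width for _ in range(height)]
    for class_id, top, left, bottom, right in boxes:
        for r in range(max(0, top), min(height, bottom)):
            for c in range(max(0, left), min(width, right)):
                label_map[r][c] = class_id
    return label_map


def label_map_to_image(label_map, palette):
    """Stage 2 stand-in: map each label to an RGB color.

    The paper learns to synthesize textures; this toy version uses a
    fixed color per class.
    """
    return [[palette[label] for label in row] for row in label_map]


# Hypothetical palette and scene: class 1 is red, class 2 is blue.
palette = {0: (0, 0, 0), 1: (255, 0, 0), 2: (0, 0, 255)}
boxes = [(1, 0, 0, 2, 2), (2, 2, 2, 4, 4)]
original = label_map_to_image(boxes_to_label_map(boxes, 4, 4), palette)

# Object-level edit: move the class-1 box two pixels to the right,
# then regenerate the image from the edited box list.
moved_boxes = [(1, 0, 2, 2, 4), boxes[1]]
edited = label_map_to_image(boxes_to_label_map(moved_boxes, 4, 4), palette)
```

The point of the sketch is only that edits happen on the box list, and the image is regenerated from the intermediate label map rather than edited pixel by pixel.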
 + 
 +https://arxiv.org/abs/1810.01868v1 Deep processing of structured data
 + 
 +We construct a general unified framework for learning representations of structured data, i.e. data which cannot be represented as fixed-length vectors (e.g. sets, graphs, texts or images of varying sizes). The key role is played by an intermediate network called SAN (Set Aggregating Network), which maps a structured object to a fixed-length vector in a high-dimensional latent space. Our main theoretical result shows that, for a sufficiently large dimension of the latent space, SAN is capable of learning a unique representation for every input example. Experiments demonstrate that replacing the pooling operation with SAN in convolutional networks leads to better results in classifying images of different sizes. Moreover, its direct application to text and graph data yields results close to SOTA, using simpler networks with fewer parameters than competing models. https://github.com/gmum/
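
The idea of mapping a variable-size set to a fixed-length vector can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each latent unit applies a ReLU to an affine function of every set element and sums the activations over the set. The helper names `make_san` and `san_embed` are hypothetical, and the weights are random rather than trained.

```python
import random


def make_san(input_dim, latent_dim, seed=0):
    """Hypothetical helper: random (untrained) weights for one aggregation layer."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(input_dim)]
               for _ in range(latent_dim)]
    biases = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    return weights, biases


def san_embed(elements, weights, biases):
    """Map a set of input_dim-vectors to a single latent_dim-vector.

    Each latent unit computes ReLU(w . x + b) for every element x and
    sums the results, so the output length is independent of the set
    size and of the order of the elements.
    """
    embedding = []
    for w, b in zip(weights, biases):
        total = 0.0
        for x in elements:
            pre_activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            total += max(0.0, pre_activation)
        embedding.append(total)
    return embedding


weights, biases = make_san(input_dim=2, latent_dim=8)
small_set = [[0.0, 1.0], [2.0, -1.0]]
large_set = [[0.5, 0.5], [1.0, 0.0], [-1.0, 2.0], [3.0, 3.0]]

vec_small = san_embed(small_set, weights, biases)
vec_large = san_embed(large_set, weights, biases)
vec_reordered = san_embed(list(reversed(large_set)), weights, biases)
```

Sets of two and four elements both yield vectors of length 8, and reordering the elements changes the result only by floating-point rounding. In a real model the weights would be trained jointly with the downstream network.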
 + 
 +http://papers.nips.cc/paper/7287-structure-aware-convolutional-neural-networks