All we need to do is build a shift-reduce parser that combines vector representations rather than subtrees. This system is a pretty simple extension of the original shift-reduce setup:

Shift pulls the next word embedding from the buffer and pushes it onto the stack. Reduce combines the top two elements of the stack $(\vec c_1, \vec c_2)$ into a single element $(\vec p)$ via the standard recursive neural network feedforward: $$\vec p = \sigma(W [\vec c_1, \vec c_2])$$.

Now we have a shift-reduce parser, deep-learning style. Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks

The only underlying LSTM structure that has been explored so far is a linear chain. However, natural language exhibits syntactic properties that would naturally combine words to phrases. We introduce the Tree-LSTM, a generalization of LSTMs to tree-structured network topologies. Enhancing and Combining Sequential and Tree LSTM for Natural Language Inference

incorporating syntactic parse information contributes to our best result; it improves the performance even when the parse information is added to an already very strong system. Learning to Compose Words into Sentences with Reinforcement Learning

We use reinforcement learning to learn tree-structured neural networks for computing representations of natural language sentences. In contrast with prior work on tree-structured models in which the trees are either provided as input or predicted using supervision from explicit treebank annotations, the tree structures in this work are optimized to improve performance on a downstream task. Experiments demonstrate the benefit of learning task-specific composition orders, outperforming both sequential encoders and recursive encoders based on treebank annotations. We analyze the induced trees and show that while they discover some linguistically intuitive structures (e.g., noun phrases, simple verb phrases), they are different than conventional English syntactic structures Neural Tree Indexers for Text Understanding

. In this paper, we introduce a robust syntactic parsing-independent tree structured model, Neural Tree Indexers (NTI) that provides a middle ground between the sequential RNNs and the syntactic treebased recursive models. NTI constructs a full n-ary tree by processing the input text with its node function in a bottom-up fashion. Attention mechanism can then be applied to both structure and node function. We implemented and evaluated a binarytree model of NTI, showing the model achieved the state-of-the-art performance on three different NLP tasks: natural language inference, answer sentence selection, and sentence classification, outperforming state-of-the-art recurrent and recursive neural networks. Learning Continuous Semantic Representations of Symbolic Expressions

Combining abstract, symbolic reasoning with continuous neural reasoning is a grand challenge of representation learning. As a step in this direction, we propose a new architecture, called neural equivalence networks, for the problem of learning continuous semantic representations of mathematical and logical expressions. These networks are trained to represent semantic equivalence, even of expressions that are syntactically very different. The challenge is that semantic representations must be computed in a syntax-directed manner, because semantics is compositional, but at the same time, small changes in syntax can lead to very large changes in semantics, which can be difficult for continuous neural architectures. We perform an exhaustive evaluation on the task of checking equivalence on a highly diverse class of symbolic algebraic and boolean expression types, showing that our model significantly outperforms existing architectures. From visual words to a visual grammar: using language modelling for image classification

The Bag–of–Visual–Words (BoVW) is a visual description technique that aims at shortening the semantic gap by partitioning a low–level feature space into regions of the feature space that potentially correspond to visual concepts and by giving more value to this space. In this paper we present a conceptual analysis of three major properties of language grammar and how they can be adapted to the computer vision and image understanding domain based on the bag of visual words paradigm. Enhanced LSTM for Natural Language Inference

In this paper, we present a new state-of-the-art result, achieving the accuracy of 88.6% on the Stanford Natural Language Inference Dataset. Unlike the previous top models that use very complicated network architectures, we first demonstrate that carefully designing sequential inference models based on chain LSTMs can outperform all previous models. Based on this, we further show that by explicitly considering recursive architectures in both local inference modeling and inference composition, we achieve additional improvement. Particularly, incorporating syntactic parsing information contributes to our best result—it further improves the performance even when added to the already very strong model. Robust Incremental Neural Semantic Graph Parsing Analogs of Linguistic Structure in Deep Representations

By comparing truth-conditional representations of encoder-produced message vectors to human-produced referring expressions, we are able to identify aligned (vector, utterance) pairs with the same meaning. We then search for structured relationships among these aligned pairs to discover simple vector space transformations corresponding to negation, conjunction, and disjunction. Our results suggest that neural representations are capable of spontaneously developing a “syntax” with functional analogues to qualitative properties of natural language. Neural Network Dependency Parser Dynamic Compositional Neural Networks over Tree Structure

most existing models suffer from the underfitting problem: they recursively use the same shared compositional function throughout the whole compositional process and lack expressive power due to inability to capture the richness of compositionality. In this paper, we address this issue by introducing the dynamic compositional neural networks over tree structure (DC-TreeNN), in which the compositional function is dynamically generated by a meta network. The role of metanetwork is to capture the metaknowledge across the different compositional rules and formulate them. Experimental results on two typical tasks show the effectiveness of the proposed models. SLING: A framework for frame semantic parsing

We describe SLING, a framework for parsing natural language into semantic frames. SLING supports general transition-based, neural-network parsing with bidirectional LSTM input encoding and a Transition Based Recurrent Unit (TBRU) for output decoding. The parsing model is trained end-to-end using only the text tokens as input. The transition system has been designed to output frame graphs directly without any intervening symbolic representation. The SLING framework includes an efficient and scalable frame store implementation as well as a neural network JIT compiler for fast inference during parsing. Generative Adversarial Networks for Model Based Reinforcement Learning with Tree Search StructVAE: Tree-structured Latent Variable Models for Semi-supervised Semantic Parsing Embedding Grammars

n this paper, we blend the structure of standard context-free grammars with the semantic generalization capabilities of word embeddings to create hybrid semantic grammars. These semantic grammars generalize the specific terminals used by the programmer to other words and phrases with related meanings, allowing the construction of compact grammars that match an entire region of the vector space rather than matching specific elements. Bridging Knowledge Gaps in Neural Entailment via Symbolic Models

We focus on filling these knowledge gaps in the Science Entailment task, by leveraging an external structured knowledge base (KB) of science facts. Our new architecture combines standard neural entailment models with a knowledge lookup module. To facilitate this lookup, we propose a fact-level decomposition of the hypothesis, and verifying the resulting sub-facts against both the textual premise and the structured KB. Our model, NSnet, learns to aggregate predictions from these heterogeneous data formats. Grammar Induction with Neural Language Models: An Unusual Replication

We find that this model represents the first empirical success for latent tree learning, and that neural network language modeling warrants further study as a setting for grammar induction. On Tree-Based Neural Sentence Modeling

In this work, we propose to empirically investigate what contributes mostly in the tree-based neural sentence encoding. We find that trivial trees without syntax surprisingly give better results, compared to the syntax tree and the latent tree. Further analysis indicates that the balanced tree gains from its shallow and balance properties compared to other trees, and right-branching tree benefits from its strong structural prior under the setting of left-to-right decoder. Finite Automata Can be Linearly Decoded from Language-Recognizing RNNs